Case study

Collective Memory

AI-powered social mobile MVP with image tagging and vector-based discovery

Social content platform · October 2024


Overview

Collective Memory was an AI-powered social mobile MVP built around image-based content, creator interaction, and semantic discovery.

The goal was not just to let users upload and browse images, but to create a product where uploaded content could be automatically understood, enriched, and surfaced through semantic discovery instead of basic keyword matching.

I built the initial MVP, with much of the complexity centered in the backend, infrastructure, and AI pipeline rather than only the React Native client.

The challenge

The core challenge was designing a system that could scale beyond a simple media feed.

The product needed to support potentially hundreds of thousands of uploaded images while keeping two things fast: the upload pipeline and the search experience.

For an early-stage product, that mattered a lot. If upload processing takes too long or related-image search feels slow, the experience breaks down quickly and the product loses its value.

That meant the architecture had to do more than store files and metadata. It had to process each image, extract useful meaning, make it searchable, and return related content quickly enough to feel smooth on mobile.

Search was not exact text matching: the system had to find the closest related content by embedding proximity rather than tags or keywords alone, the kind of nearest-neighbor retrieval that vector indexes are built for.
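The core idea can be sketched in a few lines: given an embedding for the query image, rank stored embeddings by cosine similarity. This is a minimal in-memory illustration of the retrieval concept, not the production code, which used an indexed vector store; all names and vectors below are made up for the example.

```typescript
// Nearest-neighbor lookup by cosine similarity over in-memory embeddings.
// Illustrative only: production systems use an indexed vector store
// so retrieval stays fast as the number of embeddings grows.

type Embedded = { id: string; vector: number[] };

function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Return the k items whose embeddings are closest to the query.
function nearestNeighbors(query: number[], items: Embedded[], k: number): Embedded[] {
  return [...items]
    .sort((x, y) => cosineSimilarity(query, y.vector) - cosineSimilarity(query, x.vector))
    .slice(0, k);
}
```

A brute-force scan like this is linear in the number of items, which is exactly why step four of the pipeline below hands the problem to a dedicated vector index.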

Need this kind of thinking earlier in the process? Explore my Technical Discovery for Web Apps service.

Choosing the stack

A big part of the MVP was choosing the right technical foundation early.

I chose the stack for speed, reliability, and a realistic path to shipping. The goal was not to over-engineer a custom search platform from scratch, but to get a functional AI discovery engine into a real product quickly.

For the mobile app, I used React Native so the team could move quickly on the first version of the product.

For the backend and API layer, I used Node.js, Express, and MongoDB, with AWS for infrastructure and storage buckets.

For the AI layer, I used Google Vertex AI. Multimodal embeddings were a strong fit for turning image-related input into representations that could be indexed for related-content lookup. Alternatives worth evaluating at the time included Pinecone and Weaviate, both oriented toward indexed similarity workflows.

I also wrote about the most expensive mistake in software projects (and why teams miss it): what keeps going wrong when the real problem was never pinned down.

The architecture

The product pipeline looked like this:

1. User uploads an image — The image is uploaded from the mobile client into the backend pipeline and stored in cloud storage.

2. AI analyzes the image — The uploaded image is sent through an AI analysis step, where prompt-driven logic is used to generate useful descriptive tags and semantic signals.

3. The image is embedded — Tags and image semantics are turned into embeddings via Vertex AI’s multimodal APIs so content isn’t limited to exact text matches.

4. Embeddings are indexed — Representations land in a similarity-search layer for efficient nearest-neighbor lookup. Google’s Vector Search is built for high-scale, low-latency retrieval.

5. Search returns related content — When a user interacts with an image or runs a related-content query, the system returns the closest matches from the index.

This felt smarter than upload-and-tag alone: users could explore by related meaning, not only manually entered labels.
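The five steps above can be condensed into an ingest-then-query flow. The sketch below is purely illustrative: the `tagImage` and `embedImage` helpers stand in for the Vertex AI calls, and the `Map` stands in for the managed vector index, so only the shape of the pipeline is real.

```typescript
// Illustrative pipeline skeleton: tag -> embed -> index -> query.
// In production, tagImage/embedImage would call Vertex AI and the
// index would be a managed vector-search service, not a Map.

type ImageRecord = { id: string; tags: string[]; vector: number[] };

// Stand-in for the prompt-driven AI tagging step (step 2).
async function tagImage(imageId: string): Promise<string[]> {
  return [`tag-for-${imageId}`]; // placeholder tags
}

// Stand-in for multimodal embedding generation (step 3).
async function embedImage(imageId: string, _tags: string[]): Promise<number[]> {
  // Deterministic toy vector derived from the id, for demonstration only.
  return Array.from(imageId).slice(0, 3).map((c) => c.charCodeAt(0) / 128);
}

const index = new Map<string, ImageRecord>();

// Steps 1-4: accept an upload, enrich it, and make it searchable.
async function ingest(imageId: string): Promise<ImageRecord> {
  const tags = await tagImage(imageId);
  const vector = await embedImage(imageId, tags);
  const record = { id: imageId, tags, vector };
  index.set(imageId, record);
  return record;
}

// Step 5: return the closest matches to a query embedding.
function related(query: number[], k: number): ImageRecord[] {
  const dist = (v: number[]) =>
    Math.hypot(...v.map((x, i) => x - (query[i] ?? 0)));
  return [...index.values()]
    .sort((a, b) => dist(a.vector) - dist(b.vector))
    .slice(0, k);
}
```

Keeping ingestion and retrieval behind two small functions like this made it straightforward to swap the toy pieces for real services without touching the mobile client.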

What I built

For the initial MVP, I built the core technical foundation for:

  • React Native mobile app flows
  • Node.js / Express backend APIs
  • Image upload and cloud storage flow
  • AI-powered image tagging pipeline
  • Embeddings generation flow
  • Vector-based similarity retrieval
  • MongoDB-backed metadata and app data
  • Infrastructure for the first usable product version

The heavy lifting was not only on the mobile UI. A lot of the real work was in making ingestion, enrichment, storage, and retrieval reliable enough to stay fast as data volume grew.

Why this was technically interesting

This project was interesting because it sat at the edge of what was practical to build in 2024.

It combined mobile product development, backend architecture, AI-powered image understanding, embeddings, and vector-based retrieval at a time when these workflows were still relatively new in real product environments.

A lot of MVPs are basically CRUD apps with a nicer front end. This one was different because the product value depended on whether the intelligence behind the pipeline actually worked in practice.

The challenge was not just getting the pieces connected. It was choosing technologies and shaping an architecture that could support fast uploads, useful tagging, and responsive similarity search without turning the product into an over-engineered science project.

That made the technical decisions unusually important for an MVP. The novelty was part of the challenge.

My role

I built the initial MVP and designed the core architecture behind the first version of the product.

That included choosing the technical approach, building the backend/API layer, integrating the AI image pipeline, structuring the upload and retrieval flow, and wiring tagging and discovery into a usable mobile experience.

The product later continued with another team, but the initial MVP and core technical direction were built in this phase.

Outcome

The MVP established the backend foundation for scale and proved the end-to-end AI tagging and retrieval flow in a real product slice — semantic image discovery in the first shippable version, not just a slide deck.

Beyond storing uploads, the system could analyze, enrich, and connect imagery so exploration was not limited to manual labels.

For the initial MVP, the full image-processing flow took roughly 6.7 seconds end-to-end, including upload, AI tagging, embeddings generation, and indexing. Related-image search returned in about 1.1 to 1.5 seconds, which was fast enough to keep discovery usable while validating the product direction.

Those figures were a deliberate speed-to-market choice, not a consolation prize: the goal was a sub-seven-second full roundtrip and a stable, cost-conscious launch, not weeks of tuning to chase sub-second totals before the product had real users. Where it helped the UX, heavier AI work ran asynchronously so the mobile experience stayed smooth while enrichment and indexing completed.
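The async pattern is simple to sketch: the upload handler stores the file and responds right away, while enrichment runs on a background queue. The in-process queue below is a minimal stand-in of my own invention; a production system would use a durable job queue.

```typescript
// Fire-and-return upload pattern: respond to the client immediately,
// run enrichment (tagging, embedding, indexing) in the background.
// The in-process queue is a stand-in for a durable job queue.

type Job = () => Promise<void>;

class BackgroundQueue {
  private chain: Promise<void> = Promise.resolve();

  // enqueue returns immediately; jobs run one at a time, in order.
  enqueue(job: Job): void {
    this.chain = this.chain.then(job).catch((err) => {
      console.error("enrichment job failed:", err); // keep the queue alive
    });
  }

  // Await all jobs enqueued so far (useful for tests and shutdown).
  drain(): Promise<void> {
    return this.chain;
  }
}

const queue = new BackgroundQueue();
const enriched: string[] = [];

// The request handler only stores the upload and hands off enrichment,
// so the client gets its response before the heavy AI work runs.
function handleUpload(imageId: string): { status: string } {
  queue.enqueue(async () => {
    enriched.push(imageId); // stand-in for tag + embed + index
  });
  return { status: "accepted" };
}
```

The design choice is the same one described above: the user-facing roundtrip stays fast because nothing in the request path waits on the AI pipeline.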

Tech stack

  • React Native
  • Node.js
  • Express
  • MongoDB
  • AWS
  • Google Vertex AI
  • Vector Search
  • Mobile MVP architecture