Inside Aperture Analytics: Architecture and Advanced Metadata Extraction
A deep dive into how this site is built—from its modular Flask–Dash architecture to the surprisingly complex world of photo metadata extraction.
Welcome to the technical side of Aperture Analytics.
This site isn’t just a photography portfolio—it’s a sandbox for blending high-end photography with modern data engineering and visualization. In this post, I’ll walk through how the engine under the hood works, and why the way I handle photo metadata is intentionally non-standard.
The Architecture: A Hybrid Approach
At its core, the site is built on Flask, but not in the “single app.py with everything in it” sense. Instead, I’ve leaned into a highly modular architecture that separates responsibilities into three clear layers:
1. Blueprints (Routes)
All routing logic is grouped into logical blueprints (Main, API, Admin). This keeps the application entry point (app.py) clean and focused solely on configuration and initialization, rather than business logic.
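To make that concrete, here is a minimal app-factory sketch of the pattern (the blueprint names and routes are illustrative, not the site's actual modules):

```python
from flask import Blueprint, Flask

# Illustrative blueprints; the real site defines these in their own modules.
main_bp = Blueprint("main", __name__)
api_bp = Blueprint("api", __name__)
admin_bp = Blueprint("admin", __name__)

@main_bp.route("/")
def index():
    return "gallery home"

def create_app() -> Flask:
    """App-factory sketch: the entry point stays wiring-only."""
    app = Flask(__name__)
    app.register_blueprint(main_bp)
    app.register_blueprint(api_bp, url_prefix="/api")
    app.register_blueprint(admin_bp, url_prefix="/admin")
    return app
```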
2. Service Layer
Database queries never live directly in routes. Instead, I use dedicated service classes (for example, GalleryService and BlogService). The benefit is flexibility: the front end doesn’t care whether data comes from PostgreSQL, flat files, or an external API. That abstraction makes the system easier to extend and refactor over time. The service layer also improves dashboard performance by pushing many calculations down into the database before the data is handed to Python for rendering.
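A stripped-down sketch of what such a service class can look like (the method, schema, and psycopg-style connection here are assumptions for illustration, not the real GalleryService):

```python
class GalleryService:
    """Service-layer sketch: routes and dashboards call methods, never SQL.

    `conn` is assumed to be a psycopg connection; the photos table
    and camera_model column are illustrative.
    """

    def __init__(self, conn):
        self.conn = conn

    def top_cameras(self, limit: int = 10):
        # The aggregation runs inside PostgreSQL, so Python only
        # receives a small, render-ready result set.
        cur = self.conn.execute(
            """
            SELECT camera_model, COUNT(*) AS shots
            FROM photos
            GROUP BY camera_model
            ORDER BY shots DESC
            LIMIT %s
            """,
            (limit,),
        )
        return cur.fetchall()
```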
3. Dash Integration
This is where things get fun. I dynamically load Plotly Dash applications directly into the Flask server. That allows rich, interactive analytics—like the Photo Catalog dashboard—to live seamlessly alongside traditional blog posts and image galleries, without feeling bolted on.
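Dash supports this directly by accepting an existing Flask server instead of creating its own. A minimal sketch (the URL path is illustrative):

```python
from dash import Dash, html
from flask import Flask

server = Flask(__name__)

# Attach a Dash app to the existing Flask server. Flask routes and the
# Dash app then share one WSGI application, so dashboards sit alongside
# blog and gallery pages.
catalog_dash = Dash(
    __name__,
    server=server,
    url_base_pathname="/dashboards/catalog/",
)
catalog_dash.layout = html.Div("Photo Catalog dashboard goes here")
```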
Non-Standard Design Choices
One of the more unusual design decisions is how dashboards are initialized. Rather than letting Dash reach directly into the database, I use a dependency injection pattern: app.py passes specific service methods into each dashboard at startup.
The dashboards never know where the data comes from—they just receive a callable that returns what they need. This keeps them completely decoupled from persistence details and makes them far easier to test or reuse.
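In sketch form, the wiring looks something like this (function and parameter names are mine, not the site's actual code):

```python
from dash import Dash, dcc
import plotly.express as px

def init_catalog_dashboard(server, get_photo_stats):
    """get_photo_stats: any zero-argument callable returning a DataFrame.

    The dashboard never touches the persistence layer; it only knows
    the callable it was handed at startup.
    """
    app = Dash(__name__, server=server, url_base_pathname="/dashboards/catalog/")
    # A static call keeps the sketch short; a real dashboard would
    # refresh through a callback or a layout function.
    df = get_photo_stats()
    app.layout = dcc.Graph(figure=px.histogram(df, x="focal_length"))
    return app

# In app.py, wiring only (names illustrative):
# init_catalog_dashboard(server, gallery_service.get_photo_stats)
```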
The gallery system follows a similar philosophy. Instead of serving images straight from a directory, everything is database-driven. Local staging folders—such as posts/ for articles and Photo_Uploads/Done/ for images—are synced into PostgreSQL. This improves performance and enables advanced filtering and analysis that simply wouldn’t be possible with flat files alone. It also allows me to locally perform metadata extraction from photos before further processing them and posting them to my content delivery network.
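A simplified sketch of that sync step, assuming a hypothetical extract_metadata callable (standing in for the catalog_images.py logic) and an illustrative photos table:

```python
import json
from pathlib import Path

def sync_done_photos(conn, extract_metadata, staging_dir: str = "Photo_Uploads/Done"):
    """Staging-to-database sync sketch; schema and helper are illustrative."""
    for image in sorted(Path(staging_dir).glob("*")):
        if not image.is_file():
            continue
        metadata = extract_metadata(image)  # hypothetical: returns a dict of fields
        # Upsert keeps re-runs idempotent: new files insert, edited files update.
        conn.execute(
            """
            INSERT INTO photos (filename, metadata)
            VALUES (%s, %s::jsonb)
            ON CONFLICT (filename)
            DO UPDATE SET metadata = EXCLUDED.metadata
            """,
            (image.name, json.dumps(metadata)),
        )
    conn.commit()
```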
The Crown Jewel: Advanced Metadata Extraction
If you’re a photographer, you already know that metadata is more than just “ISO 100.” I wanted Aperture Analytics to capture everything—not just the basics most platforms expose or preserve.
Most websites either strip metadata entirely or only read a shallow subset of EXIF tags. I went much deeper. By treating a photo catalog like a small data warehouse, I’ve created a space where photography and analytics genuinely intersect, highlighting both the creative and technical effort behind each image.
1. Multi-Format & Multi-Source Support
The extraction pipeline handles standard JPEGs, but it’s specifically designed with professionals in mind—particularly those working with Adobe DNG RAW files. Most serious photographers shoot in a combination of formats and rely heavily on their RAW images, using JPEGs just for display purposes.
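One way to picture the routing (the suffix sets and function are illustrative; catalog_images.py may slice this differently):

```python
from pathlib import Path

RAW_SUFFIXES = {".dng"}            # extend for other vendors' RAW formats
DISPLAY_SUFFIXES = {".jpg", ".jpeg"}

def classify_source(path: Path) -> str:
    """Decide which extraction path a file takes: RAW files get the
    deep IFD/XMP treatment, JPEGs a lighter EXIF-only pass."""
    suffix = path.suffix.lower()
    if suffix in RAW_SUFFIXES:
        return "raw"
    if suffix in DISPLAY_SUFFIXES:
        return "display"
    return "skip"
```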
2. The XMP Sidecar Strategy
In professional workflows (for example, Adobe Lightroom), edits and updated metadata are often written to .xmp sidecar files instead of the original image. The extraction script (catalog_images.py) accounts for this by:
- Detecting the presence of an .xmp file
- Comparing modification times between the image and its sidecar
- Prioritizing XMP data, which typically contains the most current keywords, ratings, and location information
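A minimal sketch of that resolution logic (the function name is mine, and the real script handles more edge cases, such as the IMG_0042.dng.xmp naming variant some tools use):

```python
from pathlib import Path

def resolve_metadata_source(image_path: str) -> Path:
    """Pick the freshest metadata source for an image."""
    image = Path(image_path)
    sidecar = image.with_suffix(".xmp")  # e.g. IMG_0042.dng -> IMG_0042.xmp

    if sidecar.exists():
        # Prefer the sidecar when it is at least as new as the image:
        # Lightroom writes edits and keywords there, not to the RAW file.
        if sidecar.stat().st_mtime >= image.stat().st_mtime:
            return sidecar
    return image
```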
3. Deep IFD Traversal
RAW files are effectively complex TIFF structures. A lot of valuable technical metadata is buried in nested Image File Directories (IFDs), such as ExifOffset and GPSInfo. Standard libraries often stop too early.
Here, the logic explicitly traverses those nested structures to extract professional-grade fields like Lens Model, Focal Length, and even Subject Distance.
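With Pillow, for example, the nested IFDs are reachable through Exif.get_ifd. A sketch (the tag IDs are standard EXIF, but the field selection is illustrative, and the DNG handling in catalog_images.py goes further than a plain JPEG read):

```python
from PIL import Image

# EXIF points to nested IFDs via offset tags rather than storing
# everything at the top level.
EXIF_IFD = 0x8769   # ExifOffset: camera and lens technical data
GPS_IFD = 0x8825    # GPSInfo: location data

def read_nested_ifds(path: str) -> dict:
    """Walk from the top-level IFD into the nested ones."""
    with Image.open(path) as im:
        exif = im.getexif()              # top-level (IFD0) tags only
        exif_ifd = exif.get_ifd(EXIF_IFD)
        gps_ifd = exif.get_ifd(GPS_IFD)

    return {
        "LensModel": exif_ifd.get(0xA434),
        "FocalLength": exif_ifd.get(0x920A),
        "SubjectDistance": exif_ifd.get(0x9206),
        "GPSLatitude": gps_ifd.get(2),
        "GPSLongitude": gps_ifd.get(4),
    }
```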
4. Intelligent Data Normalization
Metadata is messy. Shutter speeds might appear as opaque decimals (for example, 0.01666 instead of 1/60). Normalization logic converts these values into consistent, interpretable formats.
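For example, a decimal exposure time can be folded back into the familiar fraction (a minimal sketch; real EXIF values sometimes arrive as rationals already, in which case this step is unnecessary):

```python
def normalize_shutter_speed(seconds: float) -> str:
    """Turn an opaque decimal exposure time into the familiar fraction."""
    if seconds >= 1:
        return f"{seconds:g}s"        # long exposures stay decimal, e.g. "2.5s"
    return f"1/{round(1 / seconds)}"  # 0.01666 -> "1/60"
```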
My Photography Practice, in Numbers
You can see this metadata in action on the Photo Metadata Dashboard. Right now, it focuses on the basics, but that’s intentional—the foundation is there, and I’m excited to see what patterns emerge as I layer in more dimensions over time.
One area I’m particularly excited about is combining GPS data with the International Ornithologists’ Union bird taxonomy, which I’ve already loaded into Lightroom as keywords. That opens the door to analyzing my photography not just by camera settings or location, but by species, habitat, and season as well.
This is very much a living applied system—and that’s exactly the point.