Focus Area

Data

Platforms that help organizations collect, process, and derive insights from data at unprecedented scale. Data is the foundation of the AI era.

  • Vector DB Adoption: 30% of companies by 2026 (Gartner)
  • Scale AI Investment: $14.3B Meta investment in 2025
  • Platform Consolidation: 64% of companies seeking fewer tools
  • AI-Assisted Pipelines: 40% of new pipeline development

Our Investment Thesis

In the AI era, data is more valuable than ever. But most organizations are drowning in data while starving for insights. The winners will be companies that help organizations turn raw data into actionable intelligence.

We're seeing a fundamental shift in how data infrastructure works. The rise of AI workloads, real-time processing requirements, and the explosion of unstructured data are creating demand for entirely new categories of data tools.

Data infrastructure is once again in flux and evolving faster than at any point in recent memory. Organizations should expect the pace of acquisitions to continue as big vendors realize the foundational importance of data to the success of agentic AI.

We invest in companies building the next generation of data infrastructure: from vector databases and streaming platforms to data quality tools and AI-native analytics.

What We Look For

  • AI-native architecture designed for modern workloads
  • Clear differentiation from legacy data tools
  • Strong integration with the modern data stack
  • Demonstrated value in production environments
  • Path to becoming essential infrastructure
  • Teams with deep data engineering expertise

Market Insight

The long-held belief that bigger data leads to better AI is being challenged

With research suggesting high-quality public text data could be depleted as early as 2026, the focus is shifting from data quantity to data quality and freshness. In this new paradigm, stale data is a liability.

Key Trends Shaping the Market

The forces driving innovation and creating new opportunities in this space.

Vector Database Integration

Purpose-built vector databases are becoming core infrastructure, while traditional databases are absorbing vector capabilities for hybrid workloads.

Platform Consolidation

Organizations are moving from 8-12 different vendors to unified platforms. The modern data platform must provide SQL analytics, vector search, and real-time processing as integrated capabilities.

AI-Ready Data

Data products, lakehouse architecture, observability, and augmented management are becoming baseline requirements for organizations building with AI.

Contextual Memory over RAG

For agentic AI, contextual memory is surpassing traditional RAG, enabling LLMs to store and access pertinent information over extended periods.

Data Quality Focus

The focus is shifting from data quantity to data quality and freshness. Organizations adopting data quality tools early report faster insights and lower costs.

AI-Assisted Pipeline Development

40% of new data pipeline development efforts in 2025 involve AI assistance, drastically reducing the time and expertise needed for pipeline creation.

Where We See Opportunity

Specific segments and categories where we're actively seeking investments.

Vector Databases

Purpose-built databases for storing and querying embeddings, enabling semantic search and RAG applications. Essential infrastructure for AI-native applications.
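
To make "querying embeddings" concrete, here is a minimal sketch of the core operation behind semantic search: ranking stored vectors by cosine similarity to a query vector. It is a brute-force illustration only, not how any particular vector database works internally; production systems typically use approximate nearest-neighbor indexes. The document IDs and the three-dimensional embeddings are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def top_k(query, index, k=2):
    """Return the k (doc_id, score) pairs most similar to the query."""
    scored = [(doc_id, cosine_similarity(query, vec)) for doc_id, vec in index.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical embeddings; real models emit hundreds or thousands of dimensions.
INDEX = {
    "refund-policy": [0.9, 0.1, 0.0],
    "holiday-schedule": [0.0, 0.2, 0.9],
    "invoice-policy": [0.8, 0.3, 0.1],
}

results = top_k([1.0, 0.2, 0.0], INDEX, k=2)
for doc_id, score in results:
    print(f"{doc_id}: {score:.3f}")
```

In a RAG application, the top-ranked documents would then be passed to the language model as context.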

Real-Time Data

Streaming platforms and tools for processing and analyzing data as it arrives. Critical for responsive AI applications and operational intelligence.

Data Quality

Tools for ensuring data accuracy, completeness, and reliability across pipelines. The foundation for trustworthy AI systems.
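
As a rough illustration of what such tools measure, the sketch below computes two simple signals over a batch of records: completeness (the share of rows with every required field filled in) and staleness (rows not updated within a time limit). The field names, sample rows, and seven-day threshold are assumptions chosen for the example, not taken from any specific product.

```python
from datetime import datetime, timedelta, timezone


def completeness(rows, required_fields):
    """Fraction of rows in which every required field is present and non-empty."""
    if not rows:
        return 0.0
    complete = sum(
        1
        for row in rows
        if all(row.get(field) not in (None, "") for field in required_fields)
    )
    return complete / len(rows)


def stale_rows(rows, now, max_age):
    """Rows whose 'updated_at' timestamp is older than max_age."""
    return [row for row in rows if now - row["updated_at"] > max_age]


# Hypothetical records; a fixed 'now' keeps the example deterministic.
NOW = datetime(2025, 1, 10, tzinfo=timezone.utc)
ROWS = [
    {"id": 1, "email": "a@example.com", "updated_at": NOW - timedelta(hours=2)},
    {"id": 2, "email": "", "updated_at": NOW - timedelta(days=3)},
    {"id": 3, "email": "c@example.com", "updated_at": NOW - timedelta(days=9)},
    {"id": 4, "email": None, "updated_at": NOW - timedelta(hours=1)},
]

score = completeness(ROWS, ["id", "email"])
stale = stale_rows(ROWS, NOW, max_age=timedelta(days=7))
print(f"completeness: {score:.0%}, stale ids: {[row['id'] for row in stale]}")
```

Checks like these are usually run on every pipeline execution so that a drop in completeness or a rise in stale records is caught before the data reaches a model.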

Data Governance

Platforms for managing data access, lineage, and compliance at enterprise scale. Essential for regulated industries and AI compliance.

AI-Native Analytics

Analytics tools that use AI to surface insights and enable natural language querying. Making data accessible to non-technical users.

Data Integration

Modern ETL/ELT tools and data pipelines built for the cloud-native era. Connecting disparate data sources for unified analysis.

Market Landscape

Notable companies and categories shaping this market.

Vector Databases

Pinecone, Weaviate, Milvus, Qdrant, Chroma, pgvector

Data Platforms

Databricks, Snowflake, Confluent, Fivetran, dbt

Data Quality

Monte Carlo, Atlan, Great Expectations, Soda, Bigeye

Real-Time Data

Confluent, Redpanda, Materialize, Rockset, ClickHouse

Data Governance

Collibra, Alation, Immuta, BigID, OneTrust

AI-Native Analytics

ThoughtSpot, Tableau (Salesforce), Sigma, Mode, Hex

Portfolio Companies

Companies in our portfolio building in this space.

OrbioCloud (Series A)

AI-powered asset and fleet management platform that makes operations simple, efficient, and affordable.

Buffy (Seed)

AI-powered fitness companion and social platform transforming how people train and connect.


Building in Data?

We're actively investing in this space and would love to hear about what you're building.