I specialize in data engineering and big data systems at scale.

  • I help organizations handle massive data workloads efficiently—from petabyte-scale storage migrations to real-time query optimization and distributed system architecture.
  • I build production-grade tooling loved by teams.
  • Custom file formats and unstructured data don't scare me.
  • I contribute to open-source data infrastructure projects.

Technologies I (like to) work with

Clojure, Rust, Scala, TypeScript, Spark, DataFusion, Kafka, vert.x, AWS, Kubernetes, SQL, Elasticsearch, Redis

Let's collaborate

  • Building performant data infrastructure and query engines
  • Scaling your data organization across teams
  • Contributing to open-source database projects
Get in touch

Email: dstancu [at] nyu [dot] edu

PGP: 540FFD1702007D89

Open Source Contributions

Here are some things I've worked on.

Apache DataFusion

datafusion-contrib/datafusion-table-providers
Enabled federation to AWS Redshift (and older PostgreSQL versions) via a special “quirk” that enables PostgreSQL 8-compatible schema inference (no JSON, composites, enums, etc.)
✓ Merged Jul 31, 2025 +1069 -45
apache/datafusion
Fix proto reification of ListingScan nodes when projecting partition columns
✓ Merged Oct 7, 2025 +52 -13
apache/datafusion
Preserve schema metadata for DataSourceExec / FileScanConfig across proto ser/de boundary
✓ Merged Oct 14, 2025 +59 -11
apache/datafusion-ballista
Fix executor panics when default tmpfs is not writeable
✓ Merged Oct 24, 2025 +10 -4
spiceai/datafusion-ballista
Made Ballista clients scheduler-agnostic by implementing catalog RPC. Clients can now create logical plans and query any scheduler independently of server customizations—without knowing about custom table providers, logical nodes, or data source configurations. The scheduler’s catalog is now accessible remotely, eliminating the need for clients to replicate data source setup.
✓ Merged Oct 25, 2025 +964 -38

OpenJDK

CVE-2023-22025 CVSS 5.1
JVM HotSpot AES-NI IV counter overflow in aarch64 JIT intrinsic implementation.

DuckDB

duckdb/duckdb-java
Enabled use of the Appender API for raw byte[] appends, eliminating an extra copy when called from Java
✓ Merged Dec 13, 2024 +95 -2
duckdb/duckdb
Improved ART index projection/column binding logic to fix index scans for views that had column orders differing from that of the underlying table
✓ Merged Aug 28, 2025 +83 -2
spiceai/duckdb
Enabled ART index scans with composite keys by implementing a new ART index scan and state for equality assertions. Before this change, composite keys were only supported for enforcing uniqueness constraints. Now they can be queried too!
✓ Merged Nov 3, 2025 +247 -50

Spice AI

spiceai/spiceai
Wrote an ODBC data connector enabling federation to hundreds of databases (and platforms like Salesforce)
✓ Merged Apr 29, 2024 +1358 -301
spiceai/spiceai
Added static embedding model support via Model2Vec with in-process parallelism to increase vector embedding throughput by over 10x
✓ Merged Aug 26, 2025 +1320 -49
spiceai/spiceai
Added UDF for vector embeddings to allow SQL/DF native expression of vector embeddings during ingestion and search
✓ Merged Sep 9, 2025 +546 -63
spiceai/spiceai
Introduced hybrid search UDTF with an easy high-level API for Reciprocal-Rank-Fusion, supporting variadic subqueries, custom smoothing, custom join key, and rank/recency boosting with customizable decay.
✓ Merged Sep 14, 2025 +738 -16
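For readers unfamiliar with Reciprocal-Rank-Fusion, here is a minimal Python sketch of the general technique. It is not the Spice AI implementation: the function name `reciprocal_rank_fusion`, the default smoothing constant of 60, and the sample document ids are assumptions for illustration, and boosting and decay are left out.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60.0):
    """Fuse several ranked lists of ids into one scored list.

    Each id receives 1 / (k + rank) from every list it appears in,
    where rank is 1-based and k is the smoothing constant (assumed
    default: 60). Results are sorted by descending score, with ties
    broken by id so the order is deterministic.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical result lists, e.g. one from vector search and one from text search.
fused = reciprocal_rank_fusion([["a", "b", "c"], ["b", "c", "d"]])
for doc_id, score in fused:
    print(doc_id, round(score, 6))
```

Ids that appear in more than one list accumulate score, so "b" and "c" rank ahead of "a" even though "a" is first in one list. A larger `k` flattens the difference between adjacent ranks.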
spiceai/spiceai
Integrated Apache Ballista to scale the database runtime beyond a single process to a clustered, horizontally scalable execution model. Implemented a physical plan optimizer pipeline that increased data lake scan performance 5-7x over the equivalent Spark SQL query (dynamic sizing & parallelization of DataSourceExec, projection pushdown).
✓ Merged Oct 24, 2025 +2291 -135

Retrocomputing

mach-kernel/cadius
Implemented ser/de for AppleSingle files to/from ProDOS for the popular CADIUS disk image utility
✓ Merged Mar 23, 2018 +472 -56

Swagger API

swagger-api/swagger-codegen

Meta-PR that fixed:

  • OAuth 2 for Python, Ruby, PHP
  • Nested DTO ser/de
  • LOCATION header consistency
  • Support for hypermedia style path identifiers
Opened Oct 26, 2015 +303 -100

Misc

komamitsu/fluency
Added a feature to customize the SSLSocketFactory for the Java Fluentd/Fluent Bit ingestion logger
✓ Merged Apr 22, 2022 +62 -8
newrelickk/logback-newrelic-appender
Compress New Relic API requests with GZIP to allow sending larger frames
Opened Apr 25, 2025 +15 -7
nulldb/nulldb
Fixed count(*) queries for a widely used no-op ActiveRecord adapter
✓ Merged Jul 7, 2018 +55 -12
davidcelis/api-pagination
Fixed string inflector for popular Rails API pagination library
✓ Merged Jul 17, 2018 +91 -11
watsonbox/pocketsphinx-ruby
Removed a deprecated function from FFI bindings
✓ Merged Jul 25, 2017 +6 -2
ruby-grape/grape-roar
Wrote an ActiveRecord/Mongoid relation extension for ruby-grape’s hypermedia presenter library, allowing HAL links to be declared easily and auto-generated from ORM relationships
✓ Merged Jul 3, 2017 +1556 -62
jekyll/classifier-reborn
Enabled popular classifier library to run on JRuby by hooking up a native Java stemming library
✓ Merged Nov 20, 2017 +92 -4
kashifrazzaqui/json-streamer
Fixed string deserialization for Python YAJL library
✓ Merged Mar 14, 2016 +4 -1