saulpw 4 minutes ago [-]
47 releases in 3 weeks. Dozens of meaningless cruft files committed at the project root, like "h .kore_fileformat_source_v0.1.0.zip -Algorithm SHA256" with a summary of "less" commands. 19 deployments, all failed.
A+++ pure slop. If there's any value in this repo, I couldn't find it after a cursory examination, and I'm not willing to expend the effort to continue looking.
miohtama 7 minutes ago [-]
How does it achieve the performance? What is the tradeoff?
theginger 1 hour ago [-]
What does it do better than Parquet?
The quoted compression ratio is lower than Parquet's, but I'd expect higher to be better in this context.
arpadav 47 minutes ago [-]
Looks like a cool project, but I'd say keep working on it, since there seems to be some confusion about why someone would want to use this: no benchmarks, and overall pretty vibe-codey (which I'd personally be very hesitant to use in production).
Another comment already mentioned the comparison to Vortex, which has the same compression ratio and the same speeds you're claiming, but your compression is half of Parquet's. And if speed is the main goal you're going for, Python is an interesting choice. No hate, but def keep working on it; I would love to see more concrete benchmarks against various columnar store types.
microflash 58 minutes ago [-]
> The fastest, most compressed columnar format for big data
How large a dataset can it tackle? I work with Parquet files spanning 300 million+ records (~800 MB files) using DuckDB, and it works within seconds.
I might be interested to see benchmarks against Parquet and Vortex. A DuckDB extension would be great as well.
arunkore2026 1 hour ago [-]
A binary file format built from first principles for modern data systems. Parse 100MB 50x faster than JSON, with 50-70% better compression. Full language support (Python, Java, JavaScript, Go, C#, Ruby). Includes a VS Code extension for viewing .kore files. 3 years of production testing before open source release.
inheritedwisdom 1 hour ago [-]
Curious what you see as the key differentiators over Parquet / Iceberg formats with Snappy or similar compression schemes?