CSV data preprocessing online.

35+ commands all in one place for you to use.
qsv works with CSV files to slice, analyze, enrich, validate & much more!

apply

Apply a series of string, date, math, currency & geocoding transformations to a CSV column. Also has basic NLP functions (similarity, sentiment analysis, etc.).

applydp

A slimmed-down version of apply with only Datapusher+ relevant subcommands/operations (qsvdp binary variant only).

Available on the qsv CLI tool!

behead

Drop headers from a CSV.

cat

Concatenate CSV files by row or by column.

count

Count the rows in a CSV file.

dedup

Remove duplicate rows. (See also the extdedup, extsort, sort & sortcheck commands.)

diff

Find the difference between two CSVs.

enum

Add a new column that enumerates rows with incremental or UUID identifiers. Can also be used to copy a column or fill a new column with a constant value.

excel

Export a specified Excel/ODS sheet to a CSV file.

exclude

Remove a set of CSV data from another set based on the specified columns.

Available on the qsv CLI tool!

explode

Explode rows into multiple ones by splitting a column value based on the given separator.

Available on the qsv CLI tool!
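
To make the explode operation concrete, here is a minimal Python sketch of the idea (not qsv's own code). The function name `explode_rows` and the sample data are hypothetical, chosen only for illustration.

```python
import csv
import io

def explode_rows(rows, column, separator):
    """Yield one output row per separated value found in `column`."""
    for row in rows:
        for part in row[column].split(separator):
            new_row = dict(row)      # copy so the input row is not mutated
            new_row[column] = part
            yield new_row

# Hypothetical sample data: Mary has two colors separated by ';'
data = "name,colors\nMary,red;blue\nJohn,green\n"
exploded = list(explode_rows(csv.DictReader(io.StringIO(data)), "colors", ";"))
```

Each input row is repeated once per value, with every other column copied unchanged.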

extdedup

Remove duplicate rows from an arbitrarily large CSV/text file using a memory-mapped, on-disk hash table. Unlike dedup, extdedup doesn't sort the deduped file.

Available on the qsv CLI tool!
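
The hash-based approach described above can be sketched in Python as follows. This is an illustration only: it assumes an in-memory `set` stands in for the memory-mapped, on-disk hash table, and `dedup_preserving_order` is a hypothetical name.

```python
def dedup_preserving_order(lines):
    """Yield each distinct line once, in first-seen order (no sorting)."""
    seen = set()  # assumption: in-memory stand-in for an on-disk hash table
    for line in lines:
        if line not in seen:
            seen.add(line)
            yield line

deduped = list(dedup_preserving_order(["b", "a", "b", "c", "a"]))
```

Because membership is checked against a hash table rather than against a sorted neighbour, the output keeps the input order.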

extsort

Sort an arbitrarily large CSV/text file using a multithreaded external merge sort algorithm.

Available on the qsv CLI tool!
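
As a rough illustration of an external merge sort, the Python sketch below sorts fixed-size chunks in memory, spills each sorted run to a temporary file, and then merges the runs lazily. It is single-threaded and is not qsv's implementation; `external_sort` and `chunk_size` are hypothetical names, and every input line is assumed to end with a newline.

```python
import heapq
import tempfile

def external_sort(lines, chunk_size=1000):
    """Sort newline-terminated lines using sorted runs spilled to disk."""
    runs = []
    chunk = []

    def flush():
        # Sort the current chunk in memory and write it out as one run.
        chunk.sort()
        run = tempfile.TemporaryFile(mode="w+")
        run.writelines(chunk)
        run.seek(0)
        runs.append(run)
        chunk.clear()

    for line in lines:
        chunk.append(line)
        if len(chunk) >= chunk_size:
            flush()
    if chunk:
        flush()

    try:
        # heapq.merge reads the runs lazily, one line at a time per run.
        yield from heapq.merge(*runs)
    finally:
        for run in runs:
            run.close()

sorted_lines = list(external_sort(["c\n", "a\n", "b\n", "a\n"], chunk_size=2))
```

Only one chunk plus one line per run is held in memory at a time, which is what lets this style of sort handle inputs larger than RAM.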

fetch

Fetch data from web services for each row with HTTP GET. Has HTTP/2 adaptive flow control, jql JSON query language support, dynamic throttling (RateLimit) & more.

Available on the qsv CLI tool!

fetchpost

Similar to fetch, but uses HTTP POST. (HTTP GET vs POST methods)

Available on the qsv CLI tool!

fill

Fill empty values.

Available on the qsv CLI tool!

fixlengths

Force a CSV to have same-length records by either padding or truncating them.
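
A minimal Python sketch of padding or truncating records to a common length. The function name `fix_lengths` is hypothetical, and defaulting to the longest record when no length is given is an assumption made for this example.

```python
def fix_lengths(records, length=None):
    """Pad short records with empty fields and truncate long ones."""
    records = [list(r) for r in records]
    if length is None:
        # Assumption: default to the longest record seen.
        length = max((len(r) for r in records), default=0)
    return [r[:length] + [""] * (length - len(r)) for r in records]

padded = fix_lengths([["a", "b", "c"], ["1"], ["x", "y"]])
truncated = fix_lengths([["a", "b", "c"], ["1"]], length=2)
```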

flatten

Show a flattened view of CSV records. Useful for viewing one record at a time, e.g. qsv slice -i 5 data.csv | qsv flatten.

fmt

Reformat a CSV with different delimiters, record terminators or quoting rules. (Supports ASCII delimited data.)

foreach

Loop over a CSV to execute shell commands. (not available on Windows)

Available on the qsv CLI tool!

frequency

Build frequency tables of each column. Uses multithreading to go faster if an index is present.
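
Conceptually, a frequency table is a per-column count of distinct values. The single-threaded Python sketch below shows the idea; `frequency_tables` and the sample data are hypothetical and do not reflect how qsv builds its tables.

```python
import csv
import io
from collections import Counter

def frequency_tables(reader):
    """Return {column name: Counter of values} for a csv.DictReader."""
    tables = {}
    for row in reader:
        for column, value in row.items():
            tables.setdefault(column, Counter())[value] += 1
    return tables

# Hypothetical sample data
data = "color,size\nred,S\nblue,S\nred,M\n"
tables = frequency_tables(csv.DictReader(io.StringIO(data)))
top_color = tables["color"].most_common(1)
```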

generate

Generate test data by profiling a CSV using Markov decision process machine learning.

Available on the qsv CLI tool!

✨ NEW!
geocode

Geocode CSV data using an updatable copy of the Geonames Cities database.

Available on the qsv CLI tool!

headers

Show the headers of a CSV, or the intersection of all headers across many CSV files.

Available on the qsv CLI tool!

index

Create an index for a CSV that enables constant-time indexing/random access. Other commands gain multiple speed benefits & features when an index is present.

Available on the qsv CLI tool!
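
One common way to get constant-time random access is to record the byte offset at which each record starts, then seek straight to it. The Python sketch below illustrates that idea only; it says nothing about qsv's actual index file format. `build_index` and `read_record` are hypothetical names, and the sketch assumes no quoted field contains a newline.

```python
import os
import tempfile

def build_index(path):
    """Return the byte offset at which each line of the file starts."""
    offsets = []
    with open(path, "rb") as f:
        while True:
            pos = f.tell()
            if not f.readline():
                break
            offsets.append(pos)
    return offsets

def read_record(path, offsets, n):
    """Jump straight to line n without scanning the lines before it."""
    with open(path, "rb") as f:
        f.seek(offsets[n])
        return f.readline().decode("utf-8").rstrip("\r\n")

# Demo on a small temporary file (line 0 is the header row).
with tempfile.NamedTemporaryFile("w", suffix=".csv", delete=False, newline="") as tmp:
    tmp.write("id,name\n1,a\n2,b\n3,c\n")
offsets = build_index(tmp.name)
third = read_record(tmp.name, offsets, 3)
os.remove(tmp.name)
```

Building the offsets takes one pass over the file; after that, fetching any record costs a single seek regardless of its position.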

input

Read CSV data with special quoting, trimming, line-skipping & UTF-8 transcoding rules. Typically used to 'normalize' a CSV for processing with other commands.

join

Inner, outer, right, cross, anti & semi joins. Automatically creates a simple, in-memory hash index to make it faster.
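
The in-memory hash index can be pictured as a dictionary from join key to matching rows. Below is a minimal Python sketch of an inner join built that way. Which input gets indexed is an assumption here, and `hash_inner_join` is a hypothetical name, not part of qsv.

```python
from collections import defaultdict

def hash_inner_join(left, right, left_key, right_key):
    """Inner-join two lists of rows on the given column positions."""
    # Build the hash index over one input (assumption: the right side).
    index = defaultdict(list)
    for row in right:
        index[row[right_key]].append(row)
    # Probe the index once per left row instead of scanning `right` each time.
    for lrow in left:
        for rrow in index.get(lrow[left_key], ()):
            yield lrow + rrow

left = [["1", "Ann"], ["2", "Bob"], ["3", "Cy"]]
right = [["2", "NY"], ["1", "LA"], ["1", "SF"]]
joined = list(hash_inner_join(left, right, 0, 0))
```

The index turns each key lookup into an average constant-time operation, so the join costs roughly one pass over each input.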

joinp

Unlike join, joinp can process files larger than RAM, is multithreaded, has join key validation & pre-join filtering, supports asof joins & produces output without duplicate columns.

jsonl

Convert newline-delimited JSON (JSONL/NDJSON) to CSV. See the tojsonl command to convert CSV to JSONL.

Available on the qsv CLI tool!

luau

qsv's Domain-Specific Language (DSL) for data-wrangling. Create multiple new computed columns, filter rows or compute aggregations by executing scripts.

partition

Partition a CSV based on a column value.

Available on the qsv CLI tool!
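
Partitioning amounts to grouping rows by the value found in one column. The Python sketch below keeps the groups in memory for simplicity; writing each group to its own output is left out. `partition_rows` is a hypothetical name.

```python
from collections import defaultdict

def partition_rows(rows, column):
    """Group rows by the value at position `column`."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[column]].append(row)
    return dict(groups)

rows = [["NY", "Ann"], ["LA", "Bob"], ["NY", "Cy"]]
parts = partition_rows(rows, 0)
```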

pseudo

Pseudonymise the values of the given column by replacing each distinct value with an incremental identifier.

Available on the qsv CLI tool!
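
A minimal Python sketch of pseudonymisation with incremental identifiers: each distinct value is assigned the next integer the first time it is seen, and repeats reuse the same identifier. Starting the count at 0 is an assumption for this example, and `pseudonymise` is a hypothetical name.

```python
def pseudonymise(rows, column):
    """Replace each distinct value in `column` with an incremental ID."""
    ids = {}
    for row in rows:
        value = row[column]
        if value not in ids:
            ids[value] = str(len(ids))  # assumption: IDs start at 0
        out = list(row)
        out[column] = ids[value]
        yield out

rows = [["ann", "10"], ["bob", "20"], ["ann", "30"]]
masked = list(pseudonymise(rows, 0))
```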

py

Create a new computed column or filter rows by evaluating a Python expression on every row of a CSV. Supports f-strings for extended formatting, which can themselves evaluate expressions.

Available on the qsv CLI tool!

rename

Rename the columns of a CSV efficiently.

Available on the qsv CLI tool!

replace

Replace CSV data using a regex. Applies the regex to each field individually.

reverse

Reverse order of rows in a CSV. Unlike the sort --reverse command, it preserves the order of rows with the same key.

safenames

Modify headers of a CSV to only have 'safe' names - guaranteed 'database-ready'/'CKAN-ready' names.

sample

Randomly draw rows (with optional seed) from a CSV using reservoir sampling (i.e., use memory proportional to the size of the sample).
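
Reservoir sampling can be sketched in a few lines of Python. This is the classic "Algorithm R" variant, shown as an illustration rather than as qsv's exact implementation; `reservoir_sample` is a hypothetical name.

```python
import random

def reservoir_sample(rows, k, seed=None):
    """Draw up to k rows uniformly at random in a single pass."""
    rng = random.Random(seed)
    reservoir = []
    for i, row in enumerate(rows):
        if i < k:
            # Fill the reservoir with the first k rows.
            reservoir.append(row)
        else:
            # Keep row i with probability k / (i + 1).
            j = rng.randint(0, i)
            if j < k:
                reservoir[j] = row
    return reservoir

sample = reservoir_sample(range(1000), 10, seed=42)
```

Only the `k` rows in the reservoir are ever held in memory, which is why memory use is proportional to the sample size rather than to the input size.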

schema

Infer a schema from CSV data, complete with data type & domain/range validation, & output it in JSON Schema format.

search

Run a regex over a CSV. Applies the regex to each field individually & shows only matching rows.
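
"Applies the regex to each field individually" means a pattern is tested against one field at a time, so it cannot match across a field boundary. A minimal Python sketch of that behaviour, using the standard `re` module (whose regex syntax may differ from the one qsv uses); `search_rows` is a hypothetical name.

```python
import re

def search_rows(rows, pattern):
    """Keep rows in which at least one field matches the regex."""
    regex = re.compile(pattern)
    return [row for row in rows if any(regex.search(field) for field in row)]

rows = [["ab", "cd"], ["a", "bcd"], ["x", "y"]]
# '^b' anchors to the start of a field, not to the start of the whole row.
matches = search_rows(rows, r"^b")
```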

searchset

Run multiple regexes over a CSV in a single pass. Applies the regexes to each field individually & shows only matching rows.

select

Select, re-order, duplicate or drop columns.

slice

Slice rows from any part of a CSV. When an index is present, this only has to parse the rows in the slice (instead of all rows leading up to the start of the slice).

snappy

Perform streaming compression/decompression of the input using Google's Snappy framing format.

Available on the qsv CLI tool!

sniff

Sniff & infer CSV metadata (delimiter, header row, preamble rows, quote character, flexible, is_utf8 & much more). Also works as a general MIME type detector.

sort

Sort data in alphabetical (optionally case-insensitive), numerical, reverse, unique or random (optional seed) order.

sortcheck

Check if a CSV is sorted. May also optionally retrieve record count, sort breaks & duplicate count.
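
Checking sortedness needs only one pass that compares each key with the one before it. In the Python sketch below, a "sort break" is taken to mean a key that is smaller than its predecessor; that definition, the result keys and the name `sort_check` are assumptions made for illustration.

```python
def sort_check(keys):
    """Report whether keys are in ascending order, plus simple counts."""
    keys = list(keys)
    # Assumption: a sort break is a key smaller than the key before it.
    breaks = sum(1 for prev, cur in zip(keys, keys[1:]) if cur < prev)
    return {"sorted": breaks == 0, "record_count": len(keys), "sort_breaks": breaks}

ok = sort_check(["a", "b", "b", "c"])
bad = sort_check(["b", "a", "c", "b"])
```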

split

Split one CSV file into many CSV files of N chunks. Uses multithreading to go faster if an index is present.

Available on the qsv CLI tool!

table

Show aligned output of a CSV using elastic tabstops. May pair well with csvlens.

to

Convert CSV files to PostgreSQL, SQLite, XLSX, Parquet and Data Package.

Available on the qsv CLI tool!

tojsonl

Convert CSV to newline-delimited JSON (JSONL/NDJSON). Also infers JSON data types.

transpose

Transpose rows/columns of a CSV.
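
In Python, transposing a table of equal-length records is a one-liner with `zip`. The sketch below assumes every record has the same number of fields, since `zip` would otherwise silently drop the extras; `transpose` is a hypothetical helper name.

```python
def transpose(rows):
    """Swap rows and columns; assumes all records have equal length."""
    return [list(column) for column in zip(*rows)]

table = [["id", "name"], ["1", "Ann"], ["2", "Bob"]]
flipped = transpose(table)
```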

validate

Validate CSV data using JSON Schema Validation, & output the invalid records along with an error report.