MkData

Simple but powerful batch data generator for deterministic or randomized datasets.

Highlights

Lean Python DSL for rapid data synthesis.
Batch-friendly CLI with stdin streaming and file-based workflows.
Extensible via inline Python blocks and a rich prebuilt toolbox (math, random, rint, rstr, rfloat).

Installation

From PyPI

pip install mkdata

Using UV

# Assuming you have UV installed, run:
uv tool install mkdata

From source

Clone the repository: `git clone https://github.com/RayZh-hs/mkdata.git && cd mkdata`
Build a wheel: `python -m build`
Install the most recent wheel: `pip install "$(ls dist/mkdata-*-py3-none-any.whl | sort | tail -n 1)"`

Quick start

Run a generator script (conventionally *.gen):

mkdata path/to/script.gen

To read the script from stdin instead of a file, pass - as the path:

mkdata -
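If you drive mkdata from another program, the stdin form can be wrapped in a small helper. The following is a minimal sketch, not part of mkdata itself: the function name `generate_from_text` and its `executable` parameter are hypothetical, and the only CLI behavior it relies on is the `mkdata -` form shown above.

```python
import shutil
import subprocess
import sys


def generate_from_text(script_text: str, executable: str = "mkdata") -> str:
    """Pipe a generator script to `<executable> -` and return its stdout.

    `executable` is a hypothetical knob for this sketch; it defaults to the
    `mkdata` command and is assumed to be on PATH.
    """
    result = subprocess.run(
        [executable, "-"],      # "-" asks the tool to read the script from stdin
        input=script_text,
        capture_output=True,
        text=True,
        check=True,             # raise CalledProcessError on a non-zero exit
    )
    return result.stdout


if __name__ == "__main__":
    # A tiny script in the same style as the minimal example in this README.
    script = "@run {\n  rint(1, 6) \\n\n}\n"
    if shutil.which("mkdata") is None:
        print("mkdata is not on PATH; skipping the demo", file=sys.stderr)
    else:
        sys.stdout.write(generate_from_text(script))
```

Using `check=True` means a failing script surfaces as an exception instead of an empty string.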

You can find examples to run in the examples/ directory.

Minimal example

Save the following as hello.gen:

@run {
  n: rint(1, 100) \n        # Random integer between 1 and 100, inclusive
  @loop n {                 # Repeat n times
    rint(1, 20)             # By default entries concat with space
  }
  \n                        # End with newline
}

Then run it:

mkdata hello.gen

The output is written to stdout. Because the values are random, yours will differ, for example:

5
17 3 19 8 4
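For readers more familiar with plain Python, the script above does roughly the following. This is a sketch for comparison only, not how mkdata is implemented; it assumes `rint(a, b)` behaves like `random.randint(a, b)` (inclusive on both ends, as the comment in the script states), and the function name `hello_equivalent` is made up for this illustration.

```python
import random


def hello_equivalent(rng: random.Random) -> str:
    """Return text shaped like the output of hello.gen."""
    # n: rint(1, 100) \n   (assumed inclusive, like random.randint)
    n = rng.randint(1, 100)
    # @loop n { rint(1, 20) }   (entries joined with a single space)
    values = [str(rng.randint(1, 20)) for _ in range(n)]
    # The final bare \n in the script ends the second line.
    return f"{n}\n{' '.join(values)}\n"


if __name__ == "__main__":
    print(hello_equivalent(random.Random()), end="")
```

Passing a seeded `random.Random` makes this sketch reproducible, which is handy when comparing outputs by hand.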

A full set of examples is available in the examples/ directory.

Language overview

Sentences

(%)(variable:) expression (\suffix) (# comment)

% hides output while still evaluating the expression.
variable stores the evaluated result in the runtime environment.
expression must be valid Python.
\suffix customizes the string appended to the output (\n for newline, omit for empty).
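To make the shape of a sentence concrete, here is a rough sketch that splits one line into the parts listed above. It is not mkdata's parser: the names `Sentence` and `parse_sentence` are hypothetical, and the regular expression is deliberately simplified. It assumes the expression contains no `\` or `#` characters and does not start with `identifier:`, so string literals containing those characters or a leading `lambda:` would be split incorrectly. The suffix is returned as raw text without interpreting escapes.

```python
import re
from typing import NamedTuple, Optional


class Sentence(NamedTuple):
    hidden: bool              # True when the line starts with %
    variable: Optional[str]   # name before ":", if any
    expression: str           # Python expression text (may be empty)
    suffix: Optional[str]     # raw text after "\", or None when absent
    comment: Optional[str]    # text after "#", or None when absent


# Simplified pattern for: (%)(variable:) expression (\suffix) (# comment)
_SENTENCE = re.compile(
    r"^\s*"
    r"(?P<hidden>%)?\s*"
    r"(?:(?P<variable>[A-Za-z_]\w*)\s*:\s*)?"
    r"(?P<expression>[^\\#]*?)\s*"
    r"(?:\\(?P<suffix>[^\s#]*))?\s*"
    r"(?:#\s*(?P<comment>.*))?$"
)


def parse_sentence(line: str) -> Sentence:
    """Split one sentence into its parts under the assumptions above."""
    match = _SENTENCE.match(line)
    if match is None:
        raise ValueError(f"not a sentence this sketch understands: {line!r}")
    comment = match.group("comment")
    return Sentence(
        hidden=match.group("hidden") is not None,
        variable=match.group("variable"),
        expression=match.group("expression"),
        suffix=match.group("suffix"),
        comment=comment.strip() if comment is not None else None,
    )


if __name__ == "__main__":
    for text in (r"n: rint(1, 100) \n  # Random integer", "% seed: 42"):
        print(parse_sentence(text))
```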

Directives

Use @ to introduce a directive. Directives can be nested and combined freely.

Braces { ... } are required to delimit the scope of a directive. Place them in the Google style: the opening brace on the same line as the directive and the closing brace on its own line, unless an opening brace follows immediately.

| Directive | Purpose |
| --- | --- |
| `@run { ... }` | Defines an execution scope and starts the interpreter loop. |
| `@python { ... }` | Executes embedded Python for custom helpers. |
| `@redirect stdout\|stderr\|path` | Redirects output to standard streams or files. |
| `@loop count { ... }` | Repeats a block `count` times. |
| `@for target in iterable { ... }` | Iterates with optional index tracking. |
| `@any { ... } { ... }` | Picks one block at random (weights optional). |
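As a mental model, `@loop` and `@any` correspond to familiar Python constructs. The sketch below is an analogy only and says nothing about mkdata's internals or about the syntax for weights: the helper names `repeat_block` and `pick_block` are hypothetical, blocks are modeled as zero-argument functions that return text, and the space separator is an assumption taken from the minimal example's comment about default joining.

```python
import random
from typing import Callable, Optional, Sequence

Block = Callable[[], str]


def repeat_block(count: int, block: Block, sep: str = " ") -> str:
    """Analogue of `@loop count { ... }`: run the block `count` times."""
    return sep.join(block() for _ in range(count))


def pick_block(
    rng: random.Random,
    blocks: Sequence[Block],
    weights: Optional[Sequence[float]] = None,
) -> str:
    """Analogue of `@any { ... } { ... }`: run one randomly chosen block.

    With weights=None every block is equally likely.
    """
    chosen = rng.choices(blocks, weights=weights, k=1)[0]
    return chosen()


if __name__ == "__main__":
    rng = random.Random()
    small = lambda: str(rng.randint(1, 9))
    large = lambda: str(rng.randint(1000, 9999))
    # Three entries, each one drawn from either the "small" or "large" block.
    print(repeat_block(3, lambda: pick_block(rng, [small, large], weights=[3, 1])))
```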

Development

Create a virtual environment and activate it.
Install locally in editable mode: `pip install -e .`
Install tooling: `pip install pytest`
Run the test suite: `pytest`

Formatters, linters, and additional tooling are welcome. See CONTRIBUTING.md for coordination tips.

Project links

Contribution guide: CONTRIBUTING.md
Release highlights: CHANGELOG.md
License: LICENSE

Attributions

This project uses the following open source packages:

pytest for end-to-end testing.

The project's logo was designed by apien from Flaticon.
