Show HN: Plexe – ML Models from a Prompt

Build machine learning models using natural language.

Quickstart | Features | Installation | Documentation


plexe lets you create machine learning models by describing them in plain language. Simply explain what you want, and the AI-powered system builds a fully functional model through an automated agentic approach. Also available as a managed cloud service.

(Demo video: demo.mp4)

You can use plexe as a Python library to build and train machine learning models:

```python
import plexe

# Define the model
model = plexe.Model(
    intent="Predict sentiment from news articles",
    input_schema={"headline": str, "content": str},
    output_schema={"sentiment": str}
)

# Build and train the model
model.build(
    datasets=[your_dataset],
    provider="openai/gpt-4o-mini",
    max_iterations=10
)

# Use the model
prediction = model.predict({
    "headline": "New breakthrough in renewable energy",
    "content": "Scientists announced a major advancement..."
})

# Save for later use
plexe.save_model(model, "sentiment-model")
loaded_model = plexe.load_model("sentiment-model.tar.gz")
```
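In the snippet above, `your_dataset` stands in for real training data. As an illustration only (the exact dataset types plexe accepts aren't specified in this snippet, though a later example passes a DataFrame), a tiny dataset matching the schemas might look like:

```python
# Hypothetical training rows matching the schemas above: keys mirror
# input_schema ("headline", "content") plus the target ("sentiment").
your_dataset = [
    {"headline": "Markets rally on rate cut hopes",
     "content": "Stocks climbed sharply after the announcement...",
     "sentiment": "positive"},
    {"headline": "Factory output falls for third month",
     "content": "Industrial production declined again in March...",
     "sentiment": "negative"},
]
```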

2.1. 💬 Natural Language Model Definition

Define models using plain English descriptions:

```python
model = plexe.Model(
    intent="Predict housing prices based on features like size, location, etc.",
    input_schema={"square_feet": int, "bedrooms": int, "location": str},
    output_schema={"price": float}
)
```

2.2. 🤖 Multi-Agent Architecture

The system uses a team of specialized AI agents to:

  • Analyze your requirements and data
  • Plan the optimal model solution
  • Generate and improve model code
  • Test and evaluate performance
  • Package the model for deployment
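The agent roles above can be pictured as a pipeline passing shared state from one stage to the next. This is a purely illustrative sketch in plain Python, not plexe's internal architecture; every function and field name here is hypothetical:

```python
# Hypothetical sketch of a multi-agent build pipeline (not plexe internals).
# Each "agent" is modeled as a function that transforms a shared state dict.

def analyze(state):
    state["requirements"] = f"parsed: {state['intent']}"
    return state

def plan(state):
    state["plan"] = ["baseline model", "feature engineering", "tuned model"]
    return state

def generate_code(state):
    state["code"] = [f"# solution for step: {step}" for step in state["plan"]]
    return state

def evaluate(state):
    state["scores"] = {step: 0.5 for step in state["plan"]}  # placeholder metric
    return state

def package(state):
    state["artifact"] = "model.tar.gz"
    return state

def run_pipeline(intent):
    state = {"intent": intent}
    for agent in (analyze, plan, generate_code, evaluate, package):
        state = agent(state)
    return state

result = run_pipeline("Predict sentiment from news articles")
```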

2.3. 🎯 Automated Model Building

Build complete models with a single method call:

```python
model.build(
    datasets=[dataset_a, dataset_b],
    provider="openai/gpt-4o-mini",  # LLM provider
    max_iterations=10,              # Max solutions to explore
    timeout=1800                    # Optional time limit in seconds
)
```
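One plausible reading of how `max_iterations` and `timeout` interact is "stop exploring candidates at whichever limit is hit first." The following is an assumption about that behavior sketched with the stdlib, not plexe's actual implementation:

```python
import time

# Illustrative sketch (an assumption, not plexe code): explore candidate
# solutions until either the iteration cap or the wall-clock limit is hit.
def explore_solutions(candidates, max_iterations=10, timeout=1800):
    start = time.monotonic()
    explored = []
    for i, candidate in enumerate(candidates):
        if i >= max_iterations or time.monotonic() - start > timeout:
            break
        explored.append(candidate)  # stand-in for "train and score"
    return explored

tried = explore_solutions(range(25), max_iterations=10, timeout=1800)
```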

2.4. 🚀 Distributed Training with Ray

Plexe supports distributed model training and evaluation with Ray for faster parallel processing:

```python
from plexe import Model

# Optional: Configure Ray cluster address if using remote Ray
# from plexe import config
# config.ray.address = "ray://10.1.2.3:10001"

model = Model(
    intent="Predict house prices based on various features",
    distributed=True  # Enable distributed execution
)

model.build(
    datasets=[df],
    provider="openai/gpt-4o-mini"
)
```

Ray distributes your workload across available CPU cores, significantly speeding up model generation and evaluation when exploring multiple model variants.
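Ray itself isn't shown here; as a rough stdlib analogy for the pattern (independent model variants scored concurrently, then the best one picked), consider the sketch below. The scoring function is a hypothetical stand-in, and a thread pool only illustrates the fan-out/fan-in shape rather than true multi-core speedup:

```python
from concurrent.futures import ThreadPoolExecutor

def score_variant(variant_id):
    # Hypothetical stand-in for training and evaluating one candidate model.
    return variant_id, (variant_id * 7) % 10

# Fan out the candidate variants, collect their scores, keep the best.
with ThreadPoolExecutor(max_workers=4) as pool:
    scores = dict(pool.map(score_variant, range(8)))

best = max(scores, key=scores.get)
```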

2.5. 🎲 Data Generation & Schema Inference

Generate synthetic data or infer schemas automatically:

```python
# Generate synthetic data
dataset = plexe.DatasetGenerator(
    schema={"features": str, "target": int}
)
dataset.generate(500)  # Generate 500 samples

# Infer schema from intent
model = plexe.Model(intent="Predict customer churn based on usage patterns")
model.build(provider="openai/gpt-4o-mini")  # Schema inferred automatically
```
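To give a rough idea of what schema-driven generation produces, here is a hypothetical stdlib stand-in (not `DatasetGenerator`'s actual output format) emitting rows that conform to `{"features": str, "target": int}`:

```python
import random

# Hypothetical stand-in for schema-driven synthetic data generation:
# each row matches schema={"features": str, "target": int}.
def generate_rows(n, seed=0):
    rng = random.Random(seed)
    levels = ["low", "medium", "high"]
    return [
        {"features": f"usage={rng.choice(levels)}", "target": rng.randint(0, 1)}
        for _ in range(n)
    ]

rows = generate_rows(500)
```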

2.6. 🌐 Multi-Provider Support

Use your preferred LLM provider, for example:

```python
model.build(provider="openai/gpt-4o-mini")          # OpenAI
model.build(provider="anthropic/claude-3-opus")     # Anthropic
model.build(provider="ollama/llama2")               # Ollama
model.build(provider="huggingface/meta-llama/...")  # Hugging Face
```
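These strings follow LiteLLM's `provider/model` convention, where the model part may itself contain slashes (as with Hugging Face repos). A small helper, hypothetical and not part of plexe's API, shows how such strings decompose:

```python
def split_provider_string(provider):
    # Split on the first "/" only: "openai/gpt-4o-mini" ->
    # ("openai", "gpt-4o-mini"); the model part may contain further slashes.
    prefix, _, model = provider.partition("/")
    return prefix, model
```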

See LiteLLM providers for instructions and available providers.

Note

Plexe should work with most LiteLLM providers, but we actively test only with openai/* and anthropic/* models. If you encounter issues with other providers, please let us know.

3.1. Installation Options

```shell
pip install plexe                # Standard installation
pip install plexe[lightweight]   # Minimal dependencies
pip install plexe[all]           # With deep learning support
```

```shell
# Set your preferred provider's API key
export OPENAI_API_KEY=<your-key>
export ANTHROPIC_API_KEY=<your-key>
export GEMINI_API_KEY=<your-key>
```

See LiteLLM providers for environment variable names.

For full documentation, visit docs.plexe.ai.

See CONTRIBUTING.md for guidelines. Join our Discord to connect with the team.

Apache-2.0 License

  • Fine-tuning and transfer learning for small pre-trained models
  • Use Pydantic for schemas and split data generation into a separate module
  • Plexe self-hosted platform ⭐ (More details coming soon!)
  • Lightweight installation option without heavy deep learning dependencies
  • Distributed training with Ray on AWS
  • Support for non-tabular data types in model generation

If you use Plexe in your research, please cite it as follows:

```bibtex
@software{plexe2025,
  author = {De Bernardi, Marcello and Dubey, Vaibhav},
  title = {Plexe: Build machine learning models using natural language.},
  year = {2025},
  publisher = {GitHub},
  howpublished = {\url{https://github.com/plexe-ai/plexe}},
}
```