Framework Command Center

Load and manage pre-trained open-source parameters mapped directly to your custom LLaMA implementation.

Load Hugging Face LLM Models

Choose a pre-trained open-source model. The framework downloads its config and weights from Hugging Face and converts them on the fly to the custom LLaMA structure.
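A minimal sketch of what the on-the-fly conversion could look like, assuming it amounts to renaming checkpoint keys. The Hugging Face key names on the left are the standard LLaMA checkpoint names; every custom-side name (tok_embeddings, attention.wq, feed_forward.w_gate, and so on) is hypothetical and stands in for whatever the custom LLaMA modules are actually called.

```python
import re

# Left side: standard Hugging Face LLaMA checkpoint keys.
# Right side: HYPOTHETICAL names for the custom LLaMA modules.
TOP_LEVEL = {
    "model.embed_tokens.weight": "tok_embeddings.weight",
    "model.norm.weight": "final_norm.weight",
    "lm_head.weight": "output.weight",
}
PER_LAYER = {
    "self_attn.q_proj": "attention.wq",
    "self_attn.k_proj": "attention.wk",
    "self_attn.v_proj": "attention.wv",
    "self_attn.o_proj": "attention.wo",
    "mlp.gate_proj": "feed_forward.w_gate",
    "mlp.up_proj": "feed_forward.w_up",
    "mlp.down_proj": "feed_forward.w_down",
    "input_layernorm": "attention_norm",
    "post_attention_layernorm": "ffn_norm",
}
LAYER_KEY = re.compile(r"^model\.layers\.(\d+)\.(.+)\.weight$")


def convert_key(hf_key):
    """Translate one Hugging Face parameter name to the custom naming scheme."""
    if hf_key in TOP_LEVEL:
        return TOP_LEVEL[hf_key]
    match = LAYER_KEY.match(hf_key)
    if match and match.group(2) in PER_LAYER:
        layer_index, module_name = match.group(1), match.group(2)
        return f"layers.{layer_index}.{PER_LAYER[module_name]}.weight"
    raise KeyError(f"no mapping for checkpoint key: {hf_key}")


def convert_state_dict(hf_state):
    """Rename every key; tensor values are passed through untouched."""
    return {convert_key(key): value for key, value in hf_state.items()}
```

Raising on an unknown key, rather than skipping it, makes a silent partial load impossible, which is usually the safer default when weights come from an external checkpoint.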

3D Distributed Parallelism Status

Data Parallelism (DP)

NAIVE ALL-REDUCE

Gradient synchronization hooks for multi-process scaling.
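To illustrate what a naive all-reduce does to gradients, here is a single-process sketch that simulates the ranks as plain lists. It is not the framework's hook code; the function name all_reduce_mean is hypothetical, and a real multi-process run would use a collective communication library instead.

```python
def all_reduce_mean(per_rank_grads):
    """Simulate a naive all-reduce: every rank ends up with the mean gradient.

    per_rank_grads holds one flat gradient list per simulated rank.
    """
    world_size = len(per_rank_grads)
    # Sum each gradient element across ranks.
    summed = [sum(values) for values in zip(*per_rank_grads)]
    mean = [total / world_size for total in summed]
    # Every rank receives its own copy of the averaged gradient.
    return [list(mean) for _ in range(world_size)]
```

After this step each replica applies an identical update, so the model copies stay in sync even though each one saw a different slice of the batch.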

Tensor Parallelism (TP)

COLUMN & ROW SHARDED

Simulated multi-GPU row and column sharding of weights.
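The column-then-row pattern can be sketched in pure Python with nested lists. The example assumes two consecutive linear layers with no nonlinearity between them: the first weight is split by columns, the second by rows, and the per-shard partial outputs are summed, which is the step an all-reduce would perform across GPUs. All function names here are hypothetical.

```python
def matmul(a, b):
    """Plain matrix product for nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def split_columns(weight, parts):
    """Column sharding: each shard keeps a contiguous block of output columns."""
    step = len(weight[0]) // parts
    return [[row[i * step:(i + 1) * step] for row in weight] for i in range(parts)]


def split_rows(weight, parts):
    """Row sharding: each shard keeps a contiguous block of input rows."""
    step = len(weight) // parts
    return [weight[i * step:(i + 1) * step] for i in range(parts)]


def add(a, b):
    """Element-wise sum of two equally shaped matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def tensor_parallel_forward(x, w1, w2, parts):
    """Compute x @ w1 @ w2 with w1 column-sharded and w2 row-sharded."""
    w1_shards = split_columns(w1, parts)
    w2_shards = split_rows(w2, parts)
    # Each simulated device works only on its own pair of shards.
    partials = [matmul(matmul(x, a), b) for a, b in zip(w1_shards, w2_shards)]
    # Summing the partial outputs plays the role of the all-reduce.
    output = partials[0]
    for partial in partials[1:]:
        output = add(output, partial)
    return output
```

Pairing a column split with a row split means the intermediate activation never has to be gathered; only the final partial outputs are combined.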

Pipeline Parallelism (PP)

AFAB / 1F1B ENGINE

Micro-batch scheduling across pipeline stages (all-forward-all-backward or one-forward-one-backward).
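The difference between the two schedules is easiest to see as an ordered list of forward (F) and backward (B) steps for one pipeline stage. The sketch below follows the commonly described form of each schedule; it is an illustration, not the engine's code, and the function names are hypothetical.

```python
def afab_schedule(num_microbatches):
    """All-forward-all-backward: run every forward pass, then every backward pass."""
    forwards = [("F", i) for i in range(num_microbatches)]
    backwards = [("B", i) for i in range(num_microbatches)]
    return forwards + backwards


def one_f_one_b_schedule(stage, num_stages, num_microbatches):
    """One-forward-one-backward order of operations for a single stage."""
    # Earlier stages need more warmup forwards before the first backward arrives.
    warmup = min(num_stages - stage - 1, num_microbatches)
    steady = num_microbatches - warmup
    ops = [("F", i) for i in range(warmup)]
    next_forward, next_backward = warmup, 0
    for _ in range(steady):  # steady state: alternate one forward, one backward
        ops.append(("F", next_forward))
        next_forward += 1
        ops.append(("B", next_backward))
        next_backward += 1
    for _ in range(warmup):  # cooldown: drain the remaining backwards
        ops.append(("B", next_backward))
        next_backward += 1
    return ops


def peak_in_flight(ops):
    """Largest number of micro-batches holding activations at any one time."""
    live = peak = 0
    for kind, _ in ops:
        live += 1 if kind == "F" else -1
        peak = max(peak, live)
    return peak
```

With 4 stages and 8 micro-batches, stage 0 holds at most 4 micro-batches of activations under 1F1B, while AFAB holds all 8, which is the usual reason to prefer 1F1B when memory is tight.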

SmolLM-135M
Custom LLaMA model initialized with Hugging Face weights. Start typing below to generate completions.

Generation Parameters

Tweak the autoregressive sampler settings

Temperature: 0.7
Higher values produce more varied output; 0.0 is deterministic.
Top-P: 0.9
Keeps the smallest set of tokens whose cumulative probability reaches P and filters out the rest.
Top-K: 50
Limits the token pool to the K highest-probability tokens.
128
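How temperature, top-K, and top-P interact can be sketched as a filter over next-token logits. This is a pure-Python illustration under common conventions (temperature first, then top-K, then top-P), not the framework's sampler; the function names and the order of the filters are assumptions.

```python
import math
import random


def filtered_probs(logits, temperature=0.7, top_k=50, top_p=0.9):
    """Return a probability per token after temperature, top-K and top-P filtering."""
    if temperature == 0.0:
        # Deterministic case: all probability mass on the highest logit.
        best = max(range(len(logits)), key=lambda i: logits[i])
        return [1.0 if i == best else 0.0 for i in range(len(logits))]

    # Temperature scaling followed by a numerically stable softmax.
    scaled = [value / temperature for value in logits]
    peak = max(scaled)
    exps = [math.exp(value - peak) for value in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Top-K: keep only the K most likely tokens.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:top_k]

    # Top-P: keep the smallest prefix whose cumulative probability reaches P.
    kept, cumulative = [], 0.0
    for index in ranked:
        kept.append(index)
        cumulative += probs[index]
        if cumulative >= top_p:
            break

    # Renormalize over the surviving tokens.
    mass = sum(probs[i] for i in kept)
    result = [0.0] * len(probs)
    for index in kept:
        result[index] = probs[index] / mass
    return result


def sample_token(logits, rng=random, **settings):
    """Draw one token index from the filtered distribution."""
    probs = filtered_probs(logits, **settings)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]
```

For example, sample_token([2.0, 1.0, 0.0, -1.0], temperature=0.7, top_k=50, top_p=0.9) draws from whichever tokens survive both filters, so lowering top_p or top_k narrows the candidate pool.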

Fine-Tuning Console

Optimize the loaded open-source model weights on specific text inputs.

Training Visualizer

IDLE
Interactive Logs
> Standby. Ready to train model.

Hugging Face Hub Deployment

Convert your fine-tuned weights back into the standard LLaMA structure and upload them to Hugging Face.

Host Model on Hugging Face

Deploying under your profile namespace: Aravindhan11

Your repository will be created publicly at https://huggingface.co/Aravindhan11/Distributed-Llama-Model
You can create an access token in your Hugging Face settings. The token is handled strictly in local memory.