Ever wondered what it would feel like to have a chatbot that actually sounds like you? Not some generic, overly polite robot, but something that carries your quirks, your phrasing, even your weird emoji habits?
I’ve been tinkering with WeClone, a new open-source project that lets you turn your Telegram chat history into a personalized AI avatar. No cloud APIs. No data harvesting. Just your messages, your hardware, your rules.
As a physician who handles sensitive data daily, and a developer who’s spent years in the Linux terminal, I don’t trust “black box” AI tools. That’s why this one caught my eye: it runs locally, respects your privacy by design, and gives you full control.
So, What Does It Actually Do?
In plain terms:
You export your Telegram chats → clean and prep them → fine-tune a small language model on your style → deploy it as a chatbot that replies like you would.
It’s not magic. It’s careful engineering. And yes, it supports images in your training data too.
Why This Stands Out (For Me)
- Your data never leaves your machine: Fine-tuning and inference happen locally. No uploads. No surprises.
- Built for real workflows: Export from Telegram Desktop, preprocess with privacy filters (PII removal via Microsoft Presidio), then train.
- Honest about limits: The team is upfront that 7B models are merely "okay," while 14B+ models give noticeably better results. No hype.
- Linux-friendly: Works on Manjaro, Ubuntu, etc. (Windows? Use WSL for now.)
- Transparent stack: Uses Qwen2.5-VL, LLaMA Factory, uv for dependency management. You can audit every step.
Quick Reality Check: Do You Have the Hardware?
Fine-tuning isn’t free. Here’s the VRAM truth:
| Method | 7B Model | 14B Model |
|---|---|---|
| QLoRA (4-bit) | ~6 GB | ~12 GB |
| LoRA (16-bit) | ~16 GB | ~32 GB |
If you’re running a consumer GPU (RTX 3060/4070 etc.), QLoRA is your friend. No need for a data center.
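Those numbers follow from simple arithmetic: the dominant cost is holding the base weights in VRAM, and 4-bit quantization shrinks them fourfold versus 16-bit. Here's a back-of-envelope sketch (the function is my own, not part of WeClone):

```python
def weight_footprint_gb(params_billions, bits):
    """GB needed just to hold the base model weights in VRAM.

    N billion parameters at bits/8 bytes each is roughly N * bits / 8 GB.
    The gap between this and the table above is LoRA adapters, optimizer
    state, activations, and CUDA overhead.
    """
    return params_billions * bits / 8

# Rough sanity checks against the table:
print(weight_footprint_gb(7, 4))    # 3.5  -> ~6 GB total with overhead
print(weight_footprint_gb(14, 4))   # 7.0  -> ~12 GB total
print(weight_footprint_gb(7, 16))   # 14.0 -> ~16 GB total
```

The takeaway: QLoRA's 4-bit base weights are what let a 7B fine-tune fit on an 8 GB consumer card.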
💡 Pro tip: Start small. Export chats with one close contact first. Test the pipeline before scaling.
Privacy Isn’t a Feature, It’s the Foundation
WeClone strips phone numbers, emails, locations, and more by default. You can also add your own blocklist (blocked_words) to filter sensitive phrases.
But here’s the catch: no tool is 100% perfect. Always review your dataset before training. If you’re in healthcare, law, or any regulated field, double-check. Your digital twin should reflect your values, not leak your secrets.
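To make the filtering idea concrete, here is a deliberately naive, stdlib-only sketch of blocklist filtering plus PII masking. WeClone's actual PII removal relies on Microsoft Presidio, which is far more capable than these toy regexes; treat this purely as an illustration of the workflow, not a substitute for it:

```python
import re

# Deliberately naive patterns -- real PII detection (e.g. Presidio)
# uses NLP and context, not just regexes.
PII_PATTERNS = {
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
    "phone": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
}

def scrub(text, blocked_words=()):
    """Drop messages containing blocked phrases; mask obvious PII."""
    lowered = text.lower()
    if any(w.lower() in lowered for w in blocked_words):
        return None  # exclude the whole message from the dataset
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"<{label}>", text)
    return text

print(scrub("Mail me at jane@example.com"))
# Mail me at <email>
print(scrub("patient results attached", blocked_words=["patient"]))
# None
```

Whatever tool does the scrubbing, the review step stays manual: skim the cleaned dataset yourself before a single training step runs.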
Before You Dive In: Ask Yourself
- Do I really need a 14B model, or will a smaller one do for testing?
- Have I removed anything I wouldn’t want an AI to memorize?
- Am I running this on a machine I control—not a shared or cloud instance?
- What’s my goal? Fun? Research? A productivity aid? (Start with one.)
Frequently Asked Questions
Q: Is WeClone ready for production use?
A: Not yet. It’s in active development. Great for experimentation, learning, and early adoption—but don’t deploy it for critical tasks without thorough testing.
Q: Can I use WhatsApp or Discord chats?
A: Telegram is fully supported today. WhatsApp, Discord, and Slack are marked as “coming soon” (🚧). The codebase is open, so community contributions could speed this up.
Q: Do I need a powerful GPU?
A: For basic testing with a 7B model and QLoRA, 6–10 GB VRAM is enough. For better quality or larger models, aim for 12–24 GB. CPU-only is possible but slow.
Q: What if I’m on macOS or Windows?
A: Linux is the best-supported platform. macOS works with some setup. Windows users should use WSL2 for reliability.
Q: How do I keep my fine-tuned model private?
A: Everything runs locally. Your model stays in your models/ folder. No telemetry. No auto-uploads. You decide when—and if—to share.
Q: Can I use this for professional or clinical communication?
A: Proceed with extreme caution. Even with privacy filters, AI can hallucinate or misrepresent. Never use it for patient advice, legal input, or high-stakes decisions without human oversight.
Final Thought
Tools like WeClone aren’t about replacing you. They’re about extending your presence—on your terms. In a world where every “free” AI service trains on your data by default, choosing local, open, and transparent isn’t just smart engineering. It’s an act of digital self-respect.
If you’re curious:
🔗 GitHub: WeClone
🐧 Try it on your Linux box. Start small. Audit the code. Tweak the config. Make it yours.
And if you build something interesting with it? Share it. The open-source community thrives when we learn out loud.
Hazem Abbas is a physician, software engineer, and open-source advocate. He writes about privacy-first AI, Linux, and humane tech at medevel.com. He’s been using Linux since 1999 and still believes the terminal is the best thinking tool we have.




