Q&A – Ask about my profile
Ask any question about 2Z1T Conseil's skills, experience or availability.
Under the hood
The answer is not generated out of thin air. Each question triggers a search through my profile (BM25 keyword matching combined with vector similarity) to extract the most relevant passages, which are then fed to the model.
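A minimal sketch of that hybrid retrieval step, assuming a whitespace tokenizer, bag-of-words vectors as a stand-in for real embeddings, and reciprocal rank fusion to combine the two rankings (the sample passages and all function names are illustrative, not the actual pipeline):

```python
import math
from collections import Counter

def tokenize(text):
    return text.lower().split()

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Classic BM25 over pre-tokenized documents."""
    N = len(docs)
    avgdl = sum(len(d) for d in docs) / N
    df = Counter(t for d in docs for t in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        s = 0.0
        for t in query:
            if tf[t] == 0:
                continue
            idf = math.log(1 + (N - df[t] + 0.5) / (df[t] + 0.5))
            s += idf * tf[t] * (k1 + 1) / (tf[t] + k1 * (1 - b + b * len(d) / avgdl))
        scores.append(s)
    return scores

def vector_scores(query, docs, vocab):
    """Cosine similarity over bag-of-words vectors (toy stand-in for embeddings)."""
    def embed(tokens):
        c = Counter(tokens)
        return [c[t] for t in vocab]
    def cosine(u, v):
        dot = sum(a * b for a, b in zip(u, v))
        nu = math.sqrt(sum(a * a for a in u))
        nv = math.sqrt(sum(x * x for x in v))
        return dot / (nu * nv) if nu and nv else 0.0
    q = embed(query)
    return [cosine(q, embed(d)) for d in docs]

def hybrid_rank(question, passages, k=60):
    """Fuse both rankings with reciprocal rank fusion (RRF)."""
    docs = [tokenize(p) for p in passages]
    query = tokenize(question)
    vocab = sorted({t for d in docs for t in d})
    fused = [0.0] * len(passages)
    for scores in (bm25_scores(query, docs), vector_scores(query, docs, vocab)):
        ranked = sorted(range(len(passages)), key=lambda i: -scores[i])
        for rank, i in enumerate(ranked):
            fused[i] += 1.0 / (k + rank + 1)
    return sorted(range(len(passages)), key=lambda i: -fused[i])

# Hypothetical profile passages, for illustration only.
passages = [
    "Led test automation for a large e-commerce platform.",
    "Available for consulting missions from next quarter.",
    "Built CI pipelines and performance test suites.",
]
order = hybrid_rank("test automation experience", passages)
print(order[0])  # index of the best-matching passage
```

The top-ranked passages (not the whole profile) are then placed in the model's context, which keeps prompts short enough for a small local model.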
The LLM used is Qwen2.5 1.5B (~1 GB). It was selected from several candidates based on a measured relevance score.
Model selection relied on a custom automated evaluation framework built specifically for this use case: a corpus of reference questions, execution of each candidate model, and relevance scoring of their answers. Test, measure, decide – that is precisely my domain.
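The evaluation loop can be sketched like this; the token-overlap F1 metric, the sample corpus, and the lambda "models" are simplified assumptions standing in for real local-model calls and the actual relevance score:

```python
def relevance(answer, reference):
    """Token-overlap F1 between a model answer and the reference answer
    (a simple stand-in for the real relevance metric)."""
    a, r = set(answer.lower().split()), set(reference.lower().split())
    if not a or not r:
        return 0.0
    overlap = len(a & r)
    p, rec = overlap / len(a), overlap / len(r)
    return 2 * p * rec / (p + rec) if p + rec else 0.0

def evaluate(models, corpus):
    """Run every candidate over the reference corpus and average its scores."""
    results = {}
    for name, answer_fn in models.items():
        scores = [relevance(answer_fn(q), ref) for q, ref in corpus]
        results[name] = sum(scores) / len(scores)
    return results

# Hypothetical reference corpus and candidates, for illustration only.
corpus = [
    ("What QA tools do you use?", "pytest selenium and k6 for load testing"),
    ("Are you available?", "available from next quarter"),
]
models = {
    "model-a": lambda q: "pytest selenium k6 load testing available next quarter",
    "model-b": lambda q: "I am a language model",
}
scores = evaluate(models, corpus)
best = max(scores, key=scores.get)
print(best)
```

Swapping in real candidates only means replacing each lambda with a call into the local inference runtime; the measurement loop stays the same.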
Rather than a cloud API, I chose a local model running on CPU. Latency is higher – that is the price of a solution that stays cost-effective at low usage and remains fully under my control.