Pinned
-
safety-concept-vectors
Public. Extracting and validating safety concept vectors (eval-awareness, deception, sycophancy, etc.) from open-weight LLMs, extending Anthropic's emotion vectors methodology to alignment-critical concepts.
Python 1
-
eval-awareness-detection
Public. Mechanistic detection of eval-awareness in language models via representation engineering.
Python 1
-
does-quantization-kill-interpretability
Public. Does Quantization Kill Interpretability? Scaling study across 5 models (124M-2.8B): RTN destroys induction heads in small models, while GPTQ preserves them at all scales.
Python 1
-
gptq-from-scratch
Public. GPTQ post-training quantization from scratch, with GPT-2, OPT, and LLaMA support.
Jupyter Notebook 1