E-Ink News Daily

Mr. Chatterbox is a (weak) Victorian-era ethically trained model you can run on your own computer

Mr. Chatterbox is a language model trained exclusively on 28,000 Victorian-era British texts published between 1837 and 1899, with no modern data in its training set. The 340M-parameter model produces charmingly antiquated but often incoherent responses, showing both the promise and the limits of training only on out-of-copyright text. Although technically limited, it is a notable experiment in building AI models without modern web-scraped data.

Background

Most modern language models are trained on vast amounts of web-scraped data of questionable copyright status, which raises ethical and legal concerns. Researchers have been exploring an alternative: training on public domain materials, so that the provenance of a model's data is easier to verify.

Source: Simon Willison
Published: Mar 30, 2026 at 10:28 PM
Score: 6.0 / 10