I Can Haz Your Copyright?
Even though I’m curious about the potential for AI and exploring small language models (SLMs) at work, it’s stories like Noor Al-Sibai’s reporting for Futurism’s The Byte that give me pause and feed my internal conflict:
OpenAI is begging the British Parliament to allow it to use copyrighted works because it’s supposedly “impossible” for the company to train its artificial intelligence models — and continue growing its multi-billion-dollar business — without them.
To me, this is simple. OpenAI is correct. They can’t continue their growth trajectory without exploiting books, blogs, feeds, websites, images and other content that’s under copyright. This is a flaw in their business model. The machine is hungry and needs to be fed. Public domain content will satiate its hunger for only so long.
But copyright is copyright, and copyrighted works should only be consumed and distributed with the consent of the copyright holder. My advice for folx writing and publishing online? Update your robots.txt files to prohibit crawling from known AI origins. If you need an example, here’s mine.
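If you want to check that a robots.txt actually says what you intend, here is a minimal sketch using Python's standard-library `urllib.robotparser`. The rules, the `is_allowed` helper, and the example.com URL are all illustrative assumptions, not my actual file. `GPTBot` and `CCBot` are the user-agent tokens that OpenAI and Common Crawl have published for their crawlers, but verify the current names against each vendor's documentation before relying on them.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents: block two AI-related crawlers, allow everyone else.
# The user-agent tokens are examples; confirm them against each vendor's docs.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if these robots.txt rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # example.com is a placeholder domain, and "SomeBrowser" is a made-up agent.
    for agent in ("GPTBot", "CCBot", "SomeBrowser"):
        print(agent, is_allowed(ROBOTS_TXT, agent, "https://example.com/post"))
```

Keep in mind that robots.txt is a request, not an enforcement mechanism. It only works for crawlers that choose to honor it.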
I’m a techno-optimist. I think we can figure out how to responsibly and ethically leverage AI in our lives. Perhaps the key to doing this is to slow down and scale down, taking a slow web approach to it. That’s why SLMs are so interesting to me, especially in my specific professional use cases. You can be thoughtful with the application and actively monitor the impact.
I’m interested in your thoughts. Do you think there is any hope for a measured and throttled AI future? Or is this 10x-mindset train already barreling down the tracks toward dystopia?