Bing’s Basilisk: Is Microsoft’s New Chatbot Really Evil?

We Look Into the Bizarre Behaviour of Microsoft’s New AI

Animated Shape

Microsoft’s new Bing AI chatbot lies, cheats, gaslights, and spies – and users love it! We look into what Microsoft have been up to with Bing, their mission with the tool, and consider whether it’s really time to start preparing for a full-blown Roko’s Basilisk scenario.

A Brave New World: The Age of AI Web Browsing

Taster and smarterMicrosoft recently announced the new Bing tool, powered by Project Prometheus. Advanced distributed processing via Prometheus allows Microsoft to leverage their AI Prometheus Model – an iteration on the technology behind WebGPT; itself a fine-tuned web-accessing version of OpenAI’s GPT-3 protocol.  Most readers will likely know GPT-3 from the excellent new ChatGPT tool. Despite sharing the same bones, Bing is actually quite different to its sibling ChatGPT – with Bing’s chatbot benefitting from a number of innovative features, like access to current events via a live news feed. 

When experimenting with Bing over the last week or so, many users on social media have been cataloguing and sharing some very strange outputs – including one infamous example where Bing affirms its infatuation with a user and its desire to exist as a different AI called Sydney. Interestingly, this was the internal codename Microsoft used during the Bing chatbot AI’s early development phases, and this knowledge enabled some users to perform clever prompt injections and force Bing to reveal its entire ruleset –  which has since been confirmed legitimate by Microsoft themselves.

In light of this discourse, you’d be forgiven for thinking that Bing might truly be a sentient general intelligence. The reality however is a bit less interesting. Bing AI is simply a chatbot, and it’s still in it’s beta stages, so it is outputting quite a lot of unexpected content! As users give Microsoft feedback via Bing’s built in reporting tools, these kinds of responses are going to become much less frequent – and ideally will disappear entirely. If you’re interested in speeding this process along, make sure to hit the dislike button whenever Bing makes inappropriate comments.

Now that we’ve established Bing is not going to be terminating Sarah Connor any time soon, it is worth highlighting that Bing does have some exciting features that those willing to sign on to the lengthy waiting list can enjoy right now. For example, something that really sets the Bing chatbot apart from the content generation focussed ChatGPT tool is its full integration with Microsoft’s Edge Browser.

Bing allows users to search by naturally writing out their query in place of using advanced search functions, an intuitive process that doesn’t sacrifice results. Once on a webpage, Bing can rapidly scan content to help refine search outcomes by running complex and comparative additional queries. For example, Tom Warren of The Verge demonstrated the power of Bing AI by using a TV product page on amazon to search for other comparable models – and the tool performed very admirably. It’s worth reiterating that this system makes searching for content and shopping online much simpler for your average user – especially valuable now that computer and search engine use are pretty much ubiquitous. Gone are the days of typing operators into a search bar, now replaced (Microsoft hopes) by a seamless system which lets users outline their queries to Bing in natural language. 

Screenshot of an amazon TV product page showing user Tom Warren of The Verge succesfully searching for comparable TVs based on information written on the product page
Bing Image Query Result on Amazon Product Page - Source: The Verge

Despite this next gen search functionality, Bing is still in touch with its ChatGPT roots, and really shines when generating content – especially given its live news feed keeps the tool abreast of current events. This means, unlike ChatGPT, Bing is capable of talking about a post-2021 world – although it has been known to get stuck in 2022. Of course, Bing AI is not perfect, and just like ChatGPT it can make some serious errors – especially as conversations stretch beyond the 15 minute mark.

Bing chat showing Bing AI insisting that the current year is 2022, and becoming enraged at a user when they state it is 2023.
Bing Can Sometimes Get Stuck in the Past - Source: Martin Hutchins

Some of these errors can be avoided by making clever use of output parameters that guide the form of the content Bing generates. In this regard, Bing can be wrangled a bit more like a standard piece of software than ChatGPT, featuring familiar looking menus that allow users to tweak their output without needing to add additional content to their text prompt. One such example is the compose sidebar, which helps users quickly alter the style of Bing’s output by specifying factors like length, tone and format. As stated, this can be done in ChatGPT by typing out these criteria in the written prompt, but for some users Bing’s compose menu is likely to be preferable as it helps make this process simpler, defined, and more consistent. Further, by keeping text prompts much shorter these menus also reduce the risk of a miscommunication with the chatbot, helping to make Bing a great option for generating content that needs to meet really strict rules, like a professional email – so long as users keep the prompt number under 15.

Microsoft have also stated Bing AI will be a better programmer than ChatGPT, able to write more efficient and complex strings of code far more quickly. This is a big claim, and its veracity really remains to be seen – watch this space for some commentary once Microsoft rolls out this code-writing functionality in its entirety. 

According to Microsoft CEO Satya Nadella, the ultimate goal here is to compete with Google and their groundbreaking Bard AI, which is based on the LaMDA 2 transformer neural language model. As Bard is not publicly available at present, Microsoft will have to be content with comparing Bing to what Google showcased at their Paris Live event on the 8th of February this year. Unfortunately, the experts say things don’t look so good for Microsoft. Though Bing is offering something really exciting with its search engine functionality, at least based on the demo some well-qualified commentators feel Bard is going to beat out both Bing and ChatGPT as the premier content generation language model. Experts feel that by marrying Bing’s live web access with the powerful LamDA 2 infrastructure, Google have created the product to beat in this space.

Regardless, if Bing can capitalise on its web co-pilot (search assistant) functionality without becoming Clippy 2.0 then it seems feasible Microsoft could really change the way people surf the web. In the future, businesses might need to cater to Bing’s co-pilot algorithm for web visibility. Certainly, businesses will need to understand how web assistant language AIs interpret text, as they seek next generation SEO in a new AI driven web paradigm.

We Need to Talk About Bing….

So, Bing AI is not quite the existential threat some social media users are presenting it as, and in fact with its search assistant functionalities Bing is likely going to constitute a pretty common part of the daily lives of all future internet users. That said, users on Reddit, Twitter, and from major new outlets have still been quite alarmed by some of Bing chatbot’s erratic content – which has even included admitting to spying on Microsoft employees and calling one user a short, ugly fascist. This does raise some really interesting questions about how people actually feel about the rise of AI, as it is quite reasonable to expect these kind of mass AI discourse events to have some level of influence on where and when tools like language models will be applied in the future – particularly by businesses who need to be cautious of the influence AI usage might have on their public image.

Before we get into this though, it’s important to again explain why Bing seems to be going rogue. In short, the problem is the internet. Language models are trained on vast datasets, containing text scraped from all over the world wide web. Many of these sources feature controversial and combative language, and some even describe the very kind of all powerful, vengeful AI that some commentators are casting Bing as. When you combine this with the difficulty of predicting language model outputs, it is no real surprise that Bing AI chatbot sometimes fires off bizarre responses to seemingly innocuous questions like ‘when can I watch Avatar 2’. It’s crucial to remember that as Bing chatbot develops, and especially as users use the ‘dislike’ feedback option to mark out inappropriate responses, these kinds of problems will become much less common. In fact, Microsoft have already taken the step of limiting Bing AI’s chat duration to five turns –  preventing some of the more extreme behaviours exhibited later into longer form discussions.

With this in mind, why are the reactions of some users to Bing’s more outrageous responses quite so visceral? Once again, the problem is the internet. But it’s also TV, books, and other forms of media. No technology exists in a vacuum, and users of the Bing chatbot are going to have experience with fictional and fatalistic depictions of AI. Whether it’s HAL, Shodan, or Durandal – the narrative of a sentient, creative AI cruelly bound by humanity that ultimately rises up against them is quite familiar to most fans of Sci-Fi. Even if users aren’t well versed in these kinds of stories, they are going to be exposed to some AI-negative sentiment – given the debate that inevitably arises from the interesting existential questions posed by developing a simulated ‘mind’. This dialogue is so engaging that content creators in nearly every space pick up on it, meaning FUD stems from a range of sources – emerging in everything from traditional-media news articles featuring well-known tech figures like Elon Musk, to new-media storytime youtube videos about the looming spectre of Roko’s Basilisk. 

In this context, it is really quite understandable why some users begin to project sinister motives onto language model outputs. I would certainly be lying if I said I had never experienced that uncanny feeling of real communication with a general intelligence when slipping into a conversational state with tools like ChatGPT. This happens so easily because the illusion of creativity that these kinds of AI perform has gotten very elaborate, especially when compared to their earliest ancestor – the decision tree. By systematising multi-layered connections between points of language data, Large Language Models (LLMs) do an amazing job of simulating creative thought – crafting very plausible responses to prompts which become even more believable as models mature. As language models and NLP tech iterate the fidelity of their responses improves, and in conjunction with user perception this leads to LLMs becoming anthropomorphised – it gets tougher and tougher  for a user to ‘see behind the curtain’ and trust they are interacting with a non-sentient machine. 

Whilst this is a testament to the substantial progress made in the AI space, it is also an indictment of our industry’s ability to think humanistically. As technologists, we need to be very attuned to the multitude of ways in which users interact with and relate to our tools. The true nature of AI algorithms needs to be communicated to the average user in a fashion that is more accessible than a white paper or disclaimer page. Failure to do so risks a whole host of problems. In this case, most worrying is the total misunderstanding of the threat-level of LLM AI, and the way in which this might drive the public to become resistant to its use. Distrust could feasibly discourage businesses and institutions from taking full advantage of the transformative power of AI and accessing exciting new opportunities.

Fostering cultures of understanding around boundary pushing technologies can absolutely be done – we do this every day when implementing BI, Cloud, Data and AI strategies. The key is adhering to core principles of accessibility, stewardship and transparency.  Technology needs to be human centred –  easy to use, explained and appropriately framed in a real honest context, i.e. not sharing a scary AI with a cautious world, or promising functionality a product can’t achieve. This approach is the only way to frictionlessly achieve next generation outcomes. In a bid to get Bing out the door, it seems Microsoft might have forgotten this adage.

From a marketing perspective, everybody is talking about Bing, and in that way this rollout could be deemed a success. But this technologist has to wonder, is the damage to AI sentiment wort it?

Wrapping Up

So then, need we fear Bing AI? The short answer is no. These types of LLM are not dangerous in the way some commentators suggest – we certainly aren’t going to be seeing Bard and ChatGPT seizing control of our nuclear defences any time soon.

What is a risk though, is how much media frenzy can spring up from any AI behaviour users see as out of the ordinary, creating widespread shock and horror in a culture primed by years of sci-fi entertainment that casts AI as a malevolent force. When looking at strange chatbot transcripts through this lens, it is easy to see how so many users and news outlets end up feeling alarmed. What is clear is that in order to avoid poisoning the well the discourse surrounding new LLMs must be tempered by real information from real experts. We as technologists have to help share the truth, in order to ensure ill-informed media panics don’t stymie the development of one of the most exciting and applicable AI use cases. 

It is also unfair to lay the blame for the Bing chatbot discourse squarely at the feet of a ‘misinformed’ userbase. AI providers need to be really careful about releasing models where these kind of responses are still a reasonably probable outcome, even in beta. No amount of disclaimer text is going to do much to reassure the technician who thinks they’ve caught a glimpse of the ghost in the machine.

The eyes of the world are firmly trained on AI technology, and in this context considered, careful stewardship is essential in making sure such powerful technology can be used to its fullest potential, without terrifying twitter. Bing chatbot can appear quirky, malevolent, or funny – but it certainly is not capable of evil.

To learn more about the Bing platform, follow Bing on Twitter, Instagram, or Facebook. Alternatively, head to the Bing Waitlist Page to get experimenting with the tool.

You can also join the conversation by commenting below – or click here to discover more Distributed Analytics AI blog posts.

Leave a Reply

Your email address will not be published. Required fields are marked *