Once you’ve solved the grid, you can either choose New challenge to start over or use the drop-down box to pick a larger grid size for a harder challenge.

So how did an AI design a game? Creator Daniel Tait used ChatGPT on OpenAI’s website (not the Bing version) and typed in the prompt, “can you invent a logic puzzle similar to sudoku that doesn’t currently exist.”

Sumplete, originally known as “Labyrinth Sudoku,” was the AI’s answer after several rounds of response generation. Tait liked the way it looked, so he had ChatGPT actually build it in HTML and JavaScript. We tried getting the AI to design some games for us, with varying success.

When I asked it for advice on that Wordle configuration, it simply suggested the four-letter words “mere” and then “gene.”

How did Bard do?


Simply put, equally horrible. Perhaps even worse.

I’ve done a lot of testing of Bing with ChatGPT against Bard, and neither has ever performed this poorly. When asked to solve the Wordle using a hashtag format (for example, #E#E#), Bard responded, “The Wordle #384 is ‘Eager,’” which is neither true (that one was Showy) nor consistent with the answer we’re looking for.

I begged Bard to try again, and this time it claimed that the answer to Wordle #384 was “Voice.” Once more, this doesn’t match the already-known letters.

After a tortuous exchange in which Bard accused me of trying to cheat at the puzzle, I tried presenting it using underscores, like this: _E_E_.

Its initial response was “EEE,” and unless that is dolphin slang for Beset (the right answer), Bard was once more mistaken.

Bard made a number of other contradictory guesses before finally saying, “You’re right. There isn’t a five-letter word with an E as the second and fourth letters.” Somehow, I had managed to gaslight Google.
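For what it’s worth, that last claim is easy to check mechanically. Here is a minimal Python sketch, assuming a tiny hand-picked word list rather than a real Wordle dictionary; the helper name matches_pattern is made up for illustration.

```python
def matches_pattern(word: str, pattern: str) -> bool:
    """Return True if word fits a Wordle-style pattern such as "_E_E_".

    An underscore stands for any letter; any other character must equal
    the letter at that position (case-insensitive).
    """
    if len(word) != len(pattern):
        return False
    return all(p == "_" or p.upper() == w.upper() for w, p in zip(word, pattern))


# Hypothetical sample list (an assumption), not a real Wordle dictionary.
SAMPLE_WORDS = ["beset", "eager", "voice", "showy", "never", "seven"]

candidates = [w for w in SAMPLE_WORDS if matches_pattern(w, "_E_E_")]
print(candidates)  # ['beset', 'never', 'seven']
```

Even this short list turns up three words with an E in the second and fourth positions, and neither “Eager” nor “Voice” is among them.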


Why is AI so bad at Wordle?


You’d think these AI chatbots could tackle Wordle with ease, given that they are built on expensive large language models like GPT-4 and LaMDA. That’s obviously not the case, though, which may also be why developers of ChatGPT games like Sumplete have mostly stuck to working with numbers.

Michael G. Madden, a professor of computer science at the University of Galway, argues in an article for The Conversation that, because of the neural networks these chatbots use, “all text inputs must be encoded as numbers and the process that does this doesn’t capture the structure of letters within words.”

At the heart of ChatGPT is a deep neural network, which Madden describes as “a complex mathematical function, or rule, that maps inputs to outputs.” Both the inputs and the outputs must be numbers. Since the neural network in ChatGPT-4 only works with numbers, words must be “translated” into numbers before it can work with them.

A computer program known as a tokenizer performs the translation, Madden explains, and it keeps a massive list of words and letter combinations known as “tokens.” Each token is identified by a number. The word “friend,” for instance, has a token ID of 6756, so a word like “friendship” is split into the tokens “friend” and “ship,” which are represented by the identifiers 6756 and 6729.

When a user types in a query, the words are converted into numbers before ChatGPT-4 even begins handling the request. The deep neural network never has access to the words as text, so it can’t truly make sense of the letters in them.
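To make the idea concrete, here is a toy sketch in Python. It assumes a vocabulary containing only the two token IDs quoted above and uses a simple greedy longest-match split; real tokenizers have far larger vocabularies and different splitting rules, so treat the function toy_tokenize and its behavior as illustrative assumptions, not as ChatGPT’s actual tokenizer.

```python
# Toy vocabulary (an assumption): only the two IDs quoted in the article.
TOY_VOCAB = {"friend": 6756, "ship": 6729}


def toy_tokenize(word: str, vocab: dict[str, int]) -> list[int]:
    """Greedy longest-match split of word into token IDs from vocab.

    Raises ValueError if some part of the word is not covered by vocab.
    """
    ids = []
    i = 0
    while i < len(word):
        # Try the longest remaining substring first, then shrink it.
        for j in range(len(word), i, -1):
            piece = word[i:j]
            if piece in vocab:
                ids.append(vocab[piece])
                i = j
                break
        else:
            raise ValueError(f"no token covers {word[i:]!r}")
    return ids


ids = toy_tokenize("friendship", TOY_VOCAB)
print(ids)  # [6756, 6729]
```

All the network receives is the list [6756, 6729]. Nothing in those two numbers says that the word’s third letter is “i” or that it ends in “p,” which is exactly the kind of information Wordle depends on.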

How can this error be corrected?

Future LLMs can get around this in one of two ways. First, because ChatGPT-4 is known to recognize the first letter of every word, its training data might be expanded to include mappings of every letter position inside each word in its vocabulary.
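As a rough illustration of what such letter-position mappings might look like, the Python sketch below spells out each letter of a word as a plain sentence. The sentence format and the function name letter_position_examples are assumptions made for this example, not a description of how any model is actually trained.

```python
def letter_position_examples(word: str) -> list[str]:
    """Spell out the position of every letter in word as plain sentences."""
    return [
        f'Letter {i} of "{word}" is "{ch}".'
        for i, ch in enumerate(word, start=1)
    ]


examples = letter_position_examples("beset")
for line in examples:
    print(line)
```

Run over every word in a vocabulary, a generator like this would produce text that states letter positions explicitly, so the information survives tokenization instead of being hidden inside a token ID.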


