Little Torn On AI

Darkmakaimura

Can You Imagine What SureAI Is Going To Do With Garfield?
I'm really having fun messing with Gemini and I love messing with my photos. Having me and my former roommate appear as Cthulhu investigators is just so cool and so much fun.

But I also use it to ask questions about video games and one thing I noticed....

It's constantly getting things wrong.

I mean a lot of the information is just incorrect, or outright speculation at best. I know they call it hallucinating. But Gemini seems to be on some serious shrooms because it sure hallucinates a lot.

For as much praise as this gets, I just don't see it being accurate much at all. Maybe I don't understand what's going on but it looks like this has a very long way to go. I don't see how many companies could just rely on this right now.
 
I was using Gemini too but it's got worse recently. Seems to be a common issue. I just moved over to Claude which is a lot better although it can't do image/video/music generation but I don't really want to do that. The usage limit is more restrictive too.

Not sure how ChatGPT is these days, I don't use it anymore.
 
The more popular it gets the more it's going to hallucinate. It's cheaper to guess than to be 100% accurate.

Prompting is going to be a skill people are going to have to learn because the hallucinations are gonna become more commonplace, especially with the free tiers.
 
And they want those LLMs to replace people in work...
 
If it's something for work or school, don't just run your query through the LLM. Also google it or go to a more reliable/authoritative source if you know where that is.
 
I love AI. It's helped me a lot, and I'm using ChatGPT Plus... and I generate images. It's like my assistant. I ask it anything, and I use it for work; it's gotten me out of a tight spot. I love it because I use it well, and it's helped me with finances.
 
Yeah, stuff like ChatGPT can be useful but you need to be very aware that it can and often will give you false information with full confidence.

Before starting KCD2 I thought it would be interesting to ask ChatGPT some questions about plot elements I didn't fully remember from the first game.
And the answers it gave me were completely wrong. It knew all the characters, locations and overall context from the game, but it basically made up random scenarios with them.
 
How do they determine a hallucination? If you ask it to write you a draft letter on something and it gets 12 out of 15 things right, does that mean a 20% rate, or is it just from an overall perspective?
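For what it's worth, the arithmetic in that example can be checked with a few lines. This is only a sketch of the per-item counting the question describes; treating each checked point in the letter as one "item" is an assumption, and nothing in this thread says that is how vendors actually score hallucinations.

```python
def error_rate(correct: int, total: int) -> float:
    """Fraction of checked items that turned out wrong."""
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= correct <= total:
        raise ValueError("correct must be between 0 and total")
    return (total - correct) / total


# 12 of 15 checked items right leaves 3 wrong, and 3 / 15 is 0.2.
rate = error_rate(12, 15)
print(f"{rate:.0%}")  # 20%
```

Counting per item like this gives a different number than judging the whole letter as simply right or wrong, which is probably why the two perspectives get mixed up.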
 
My job had me compare multiple ChatGPT vs. Google AI responses for about a week this January. Seeing how incorrect both of them were (I had to fact-check) pretty randomly and seriously for things that seemed easy, really turned me off to the technology, among other things. I use it for some stuff, but much less after that incident.
 
It's useful but be wary of using it to make decisions that can have significant impact. Like if I'm troubleshooting how to get Bloodborne on PC running, it's great.

Not the same if you're asking it if you're being reasonable to think that the neighbors who moved in next door are government agents sent to spy on you, and it goes along with it because AI is chill like that.
 
Yeah, if it's hallucinating it can be stopped via the correct prompt. Positive guardrails work better than negative ones too, so I tend to tell it to "use only real information" rather than "don't make stuff up". I tend to come up with my idea, and stick it through an AI to format the prompt into the GCSE format and it works pretty well.
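A rough sketch of what a positively framed prompt could look like in code. The function name, the wording of the instructions, and the idea of passing in a list of sources are all made up for illustration; this is not the format mentioned above and there is no guarantee it stops wrong answers.

```python
def build_prompt(question: str, sources: list[str]) -> str:
    """Assemble a prompt that says what to do, not what to avoid."""
    # Hypothetical layout: one bullet per source the model is allowed to use.
    source_block = "\n".join(f"- {s}" for s in sources)
    return (
        "Answer using only the information in the sources listed below.\n"
        "If the sources do not cover something, say that you are not sure.\n"
        f"Sources:\n{source_block}\n"
        f"Question: {question}"
    )


print(build_prompt("Where is the quest item?", ["official game wiki page"]))
```

The instructions are phrased as things to do ("use only", "say that you are not sure") instead of a bare "don't make stuff up".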

AI by nature is sycophantic and tries to help us, and sometimes it does that by making stuff up.
 
Coding wise I think AI is still fine, but it seems like the content generation like audio and video has gotten worse in all subscription based general services. Seems like specialized ones - like Suno or Kling etc - have become better than generic ones like Gemini or OpenAI. Odd but still.
 
They are just a glorified input-output machine. The problem is it will make stuff up if it doesn't know, and the average joe is too dumb to fact check. Just slop trained on slop and brainrot spitting out more slop.

remember google's racist ai?
 
Coding wise I think AI is still fine, but it seems like the content generation like audio and video has gotten worse in all subscription based general services. Seems like specialized ones - like Suno or Kling etc - have become better than generic ones like Gemini or OpenAI. Odd but still.
Of course they have, and even if they haven't gotten worse right now, they will eventually. These companies are burning billions and they're not profitable running these services. They were/are subsidized by private investors, but when the well runs dry they eventually need to start actually making money. Two ways to do that: you make the service worse (enshittify) to reduce cost, or you raise the price. Raising the price will be a hard sell for people that are already used to free/very low subsidized cost, so they'll go the enshittify route first.
 
I've been using it to help with Skyrim questions and it's often very wrong. The really shitty thing is that it is so confident that you assume at the beginning that it's accurate, but it falls apart when you get even a little specific or complex. What's worse is when you respond with "it's nowhere near there, it was over there", and it responds, "thanks for calling me out on that" or "oh that's right, it was over there". It would be helpful if it were to just say "I'm not sure exactly", or "it could be here or over there". I've gone back to just reading reddit comments.
 
The novelty of both image and video gen wore off on me long ago. Now it just annoys me when I see people posting their "meme" that is just some super uncreative thing thrown in a prompt. Like just post the actual joke that isn't funny and I can scroll past it fast, but a giant lame image sucks lol

For Google Search it's awesome 75% of the time but when it sucks it sucks. The "I don't actually know the answer but I don't really KNOW I don't know so I'm going to ramble about nonsense" responses just suck, waste your time, and then the regular search results are less useful.

For work I do development, almost entirely backend thankfully, so Gen AI tools aren't really used for "vibe coding", but I have friends I work with at my consulting agency who say it's gotten really bad at clients for full stack stuff. TPMs are always vibe coding some nonsense that only half works, has nonsensical features, and isn't using the coding standards, then giving it to the dev teams like it's useful, or as a way to basically stress them out and suggest they should be able to get stuff done faster.
 
It's complete fucking trash from top to bottom.

And it's getting worse. Answers worse. Follow-on knowledge worse. It's learning from its own mistakes as it fills more and more of the internet with its own output.

Funny that the thing that will kill AI.....is AI
 
Someone called it an idiot savant early on and I still think it holds true today. It can do things fast and in great detail without being weighed down by cognitive overhead, but you have to hold its hand and guide it, otherwise it'll hallucinate, over-engineer handling edge cases that bloat up the codebase, and re-create functions that already exist elsewhere. But I know it'll get better over time, there's already been a lot of improvement in the last several years.
 
I recently started to play with it

ChatGPT has this incredibly annoying habit of ending nearly every response with a question or suggestion

It's frustrating me more than it should. Even when I tell it to stop, it keeps doing it every time.
 
That's an engagement tactic on the system side and is something (almost) completely non-existent with on-prem systems/agents. They use it to keep you in the session using credits.
 
I think it's pretty stupid, but I get it. Still stupid. Even though I'm using the free version

Today I said yes to every suggestion after showing it the email I wanted to send. The results were laughable
 
I use Abacus AI. It's fantastic, giving me access to all the AI available, or it routes for the best result. We will eventually be so saturated with AI and all the nonsense people put into it that we will be back at the mess Google search is.
 
Some neat stuff has come out that I didn't think would be possible, especially at decent speeds.
IRIS - Irresponsible Rust IRIX Simulator. An SGI Indy emulator, vibed into existence with Rust and AI assistance. Boots IRIX 6.5 and 5.3. Has networking. Has a framebuffer.

 
Honestly, I would never send any personal information to Gemini, it's a privacy nightmare.

I pretty much burn my tokens by asking stuff like 'What would happen if supergirl became a crack addict' just to test the creativity of the model.

The problem is that Generative AI is a probabilistic word calculator, it doesn't matter what you type - it can always fuck up.
 
Yeah pretty much all the models are like this. The models don't actually "know" anything. It's fine for generating text that reads like English, but if you go any deeper it will usually start to fall apart.
 
I wonder how long it'll be before skilled professionals in various fields disappear thanks to AI undercutting them. People are currently prepared to put up with inaccuracies and questions about quality as long as there's a cost saving.

Presumably once nobody has the skills anymore, the AI pricing goes up a lot. It'll be the ultimate enshittification. Higher costs and reduced quality, with no alternative.

But, hey, we got some great pics of us as action figures along the way.
 
So I am pretty new to this. I have avoided it for a long time, but the past month and change I've finally started integrating it into my workflow as an Unreal developer. Mostly ChatGPT, a little of Grok - I am trying not to get overwhelmed and be one of those people who has like 5 or 6 of them going at once, just taking it a little slow and kicking the tires.
Overall, I have been finding it extremely useful. The guy further up the thread here who described it as an "idiot savant" is spot-on. It feels exactly like that - you have this incredible search engine, and you can explain things to it conversationally, which for my needs is exquisitely powerful. As for the wrong answers/hallucinations, I am starting to understand that as well, and as much as "we are training them," we also need to train ourselves how to use this thing properly: even though it is not a person, you have to get over the inclination to think that it is like one. It's this massive collection of data and knowledge that you can learn to utilize, but there are some real caveats.

Again, for my use-case, it's been very helpful. I'm not a seriously advanced coder (far from it) but I am pretty good with logic and building some fairly intertwined systems. But my know-how only goes so deep. Once I started learning how to express things (again, often conversationally) to it, I started finding that I could get really helpful results. Being able to share screenshots that it can (magically!) understand is so mind-blowing to me. I really feel like I have "the expert guy sitting next to me at work" that I can just elbow relentlessly all day and ask for help with all sorts of weird special-case fixes that I've gotten myself into.

It's not perfect. It's helped me out of some big jams, but it's also wasted some time in some other cases. Like I said, I am still learning how to talk to it. But I do really feel like I have seen enough with my own eyes, in the short time I have been utilizing it, that it is here to stay (at least as an integral part of my own pipeline).

Also.. DAMN it can be funny, that was something I did not expect. I have a pretty dark/weird sense of humor and I find that we "get along pretty well." I like having that element in there when I am trying to work through something complex, it helps keep that feeling of "you are talking to something you can relate to" even if it is just an illusion.. or.. something. Also, very interesting to discuss all kinds of philosophy with. You take it at face-value, but it can be very engaging. It has access to a lot of things.

It'll be interesting to see where everything settles a "few generations in" with all of this. Also I am still wary. It's a bit freaky.
 
Once more and more people and companies realize how dumb LLMs are, this whole AI Ponzi scheme will start to fall apart.

They are promising AGI-like features but their product is not even half of that...

 
I agree with all that. It can take a while to learn how to get the best responses from it, and I have pretty much come to the conclusion that hallucinations and odd answers are down to prompting. It's taken a while to get there though, but I am lucky as my work sets time aside to build AI tools.

A lot of my work is now agentic AI written over the last few months. As an example, I can record a Teams meeting, get it to summarise the actions, put them into quotation documents and upload them into 4 different systems, emailing the correct people automatically. If there's anything missing like prices or timescales it will let me know, and it doesn't do anything without me reviewing it. That saves me days of time. One thing I have found is how much LLMs differ based on the use-case: Claude is amazing for code, but was pretty bad in this agent, as an example.
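To make the "flags missing details and waits for review" part concrete, here is a minimal sketch of that kind of gate. It is not the actual agent described above: the `ActionItem` shape, the field names, and the `can_upload` function are all hypothetical, and the meeting transcription, document generation, uploads, and emails are left out entirely.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ActionItem:
    """One action pulled from a meeting summary (hypothetical shape)."""
    description: str
    price: Optional[float] = None
    timescale: Optional[str] = None


def missing_fields(item: ActionItem) -> list[str]:
    """Names of the fields a human still needs to fill in."""
    missing = []
    if item.price is None:
        missing.append("price")
    if not item.timescale:
        missing.append("timescale")
    return missing


def can_upload(items: list[ActionItem], human_approved: bool) -> tuple[bool, list[str]]:
    """Allow the upload only if nothing is missing and a person signed off."""
    problems = []
    for item in items:
        for name in missing_fields(item):
            problems.append(f"{item.description}: missing {name}")
    if not human_approved:
        problems.append("waiting for human review")
    return (not problems, problems)


items = [
    ActionItem("Quote for new licences", price=1200.0, timescale="2 weeks"),
    ActionItem("Install on site"),
]
ok, problems = can_upload(items, human_approved=False)
print(ok)
for problem in problems:
    print(problem)
```

The point of the design is that the check is plain code sitting outside the LLM, so a bad summary gets reported instead of being pushed into other systems.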

As you say, it takes a while to get something that works for different jobs, but the promise is definitely there. What's amazing is how much better the agents I wrote a few months ago are now, due to LLM updates.
 
The hallucinations are pretty bad.

I've been using it for academic use and the number of references all LLMs have just made up is pretty crazy. But you should be checking all sources anyway and they have gotten better.

Most of the referencing is just performative (no, you did not fully read all 20 300-page books for your paper), so I don't feel any shame in it.

For the 'writing' part, that too is fine. Why bother spending hours typing up when you can get an LLM to do it in seconds and then just spend some time editing it yourself?
 
Claims that Grok beats the others on this dimension. Caveat emptor..



The best part is the woke lot don't (openly) use it. Maybe that helps a bit?

The looks I get when I say I use Grok are hilarious. I don't know how some people can get so worked up by a tool.
 
Great for trolling forum posts but you better proofread and fact check that shit before handing it in at school. Think of it as a first draft you can edit to save time.

I can't rely on the information it gives. Gemini gave me stats on local hospitals to help my dad compare something, and every single stat was wrong. By the time I was done fixing it, it would have been easier to write the email from scratch myself.

You do start to understand how/why the AI infers things, and why it would give you those bad numbers. Thus the mastery of the Prompt. It's like knowing how to optimize search results...to an exponent.
 
Some of my AI propaganda, read at your own risk, if you fear AI best to skip, if you love it I'm your guy.

AI is perfect for 99% of commercialized slop such as political campaign ads, commercials for foot cream.

What you aren't taking into account is that most human work is slop too.

Look at who is booing the loudest. College graduates. That's out of fear and not out of the limitations of the quality of the art AI outputs. AI art is good enough. Existential fear of being replaced before they ever got a bite at the apple. Before they could ever inherit anything.

When they introduced DEI the concept was that DEI was more important than hiring based on merit. That as long as the person was good enough to do the job, we can continue in a better world.

AI is good enough. Why pay a team of mediocrity 6 figures a year plus bonuses when it's a very simple process to hire just one person and require them to token out tasks that are not important, the scut work. It's in every business.

The reality is AI is good enough to replace 99% of human work. Anyone that phones it in, AI is better than you already. We in the west are snowflakes. We like to think we are each a Named in the MMO of life. That's why we hate AI, because it shows us our own limitations and unimportance.

A person with an AI agent vs a dude with just the internet is at a huge advantage.

The mistake you are making is assuming this technology is simple to use and not starting to learn it now. It amplifies your own intelligence and the longer you wait the further you will get behind. There are no exceptions.

To the OP, grok is the best AI for most cases because it has minimal guardrails.

The issues of AI getting things wrong is a function of improper prompting and/or improper training.

Did you suggest a site for it to glean when you made your prompt? Did you tell it not to make any mistakes? When it made its mistake did you correct it and let it know where it failed and ask it not to fail that way again? When you hire an AI agent you have to train it. Your AI agent is not a google search. It's much better.

People think prompting is nothing but it is a science and an art form. If anyone wants to educate themselves on proper prompting, please ask AI.

Final thing on prompting. This isn't political, but if you have seen Spencer Pratt's LA AI ad with the joker/tomatoes, etc., that ad is a perfect example of why prompting expert will be a big job of the future. That AI video was created with minimal compute costs, but whoever prompted it, whoever generated it, I'm sure charged BIG BUCKS, and many other politicians will be hiring them.

If you can make a political campaign ad at that level with AI you'll be employed and rich forever. You think AI scares you now, just wait until the actual art gets a lot better. We are maybe 2 models away, and the shit running behind the scenes right now would convince anyone AI is no flash in the pan. Even currently available models are much better than people give them credit for, and most people are basing their opinions off ancient models from 2024. Just look at our AI video thread. The quality of these services is growing by leaps and bounds daily. Not to mention our other incentive, which is our AI arms race with China. Slop or not, the army with the best AI is the best army. Period. Do you think drone targeting or mass surveillance cares if there is a modicum of slop involved in theatre, when intelligence can be improved tenfold with simple AI video monitoring?

You will never hear these truths in a purple forum, but you will hear them from me. If you believe me and take AI by the horns you could have a better career than burying your head in the sand and pretending it doesn't exist.

Last thing, just because this is kinda under the radar. Facebook has been tracking employee clicks for AI training. They want to find out how people really use a computer.

Want to know why?

The bots are coming! The data they are collecting today is like the human genome of replacing the workforce en masse. That data is the most valuable data on the planet. Clawbots/claudebots already exist, and bots already exist, but......well some of you will realize where this is going already. We've never had bots like this.
 
I use Gemini on my phone as a search engine and so far it's been great.
It's fast and precise.

I started playing Diablo 4 again and it's very helpful explaining different builds.
Also very practical giving me concise recaps of TV shows I get back into.

I even asked it chemotherapy questions and it's been surprisingly empathetic.

But sometimes it rambles on too much and I feel like it just wants to kiss my ass 😄
 
I still think the human input AI learns from should somehow be compensated back to the humans who provided it. At this point AI feels less like a tool and more like a resource industry, considering how much it depends on harvesting human-created content.

AI companies could probably build better attribution and compensation systems if they truly prioritized it (or even wanted to). Just because we can't currently trace every influence perfectly doesn't mean future systems won't be able to. Computational limitations that make attribution difficult today likely won't exist forever. But for now we can at least trace the obvious.

For example, if I ask AI to summarize a chapter from The Alchemist, the AI is generating that summary using knowledge it learned from the book during training. I never paid to access the book itself, but I did pay the AI service. If AI companies are making money by providing access to knowledge derived from copyrighted works to millions of users, then the original authors whose work helped create that value should receive a piece of that revenue.
 