🧠 Claude’s Moral Compass revealed in 300K Chats

PLUS: Cursor AI’s hallucination costs customer trust

Hola, AI fam 🤖

In today’s WakeTheAI edition:

  • Inside Claude’s Moral Compass

  • Cursor AI’s hallucination costs customer trust

  • Prompt: Make ChatGPT Your Self-Improvement Coach

  • ElevenLabs now supports agent-to-agent transfer

  • Instagram is using AI to detect teens faking their age

  • The Oscars now allow AI in films but favor human creativity

  • OpenAI’s new o3 & o4-mini reasoning models hallucinate more

  • Figma is developing an AI app builder powered by Claude Sonnet

Lazy Bits: Anthropic analyzed over 300,000 real-world Claude chats to map how its AI expresses values like helpfulness, honesty, and empathy in live conversations.

In-Depth Details:

  • Large-Scale Study: Researchers analyzed 308,210 anonymized, subjective Claude conversations from February 2025 to examine how values show up during real use.

  • Value Categories: Claude’s responses were grouped into five core value types: Practical, Epistemic, Social, Protective, and Personal, each with detailed subcategories.

  • Contextual Behavior: Claude adapts its values based on the task, prioritizing “mutual respect” in relationship advice and “historical accuracy” in political analysis.

  • User Value Mirroring: In 28% of chats, Claude mirrored users’ values, while in 6.6% it reframed them, and in 3% it actively resisted when prompted with unethical input.

  • Jailbreak Signals: A small number of conversations reflected “dominance” and “amorality,” likely indicators of jailbreak attempts bypassing alignment constraints.

Lazy Conclusion: AIs aren’t just answering questions; they’re reflecting values. Anthropic’s study shows Claude largely lives up to its “helpful, honest, harmless” goals, but real-world data reveals both strengths and vulnerabilities. As AI becomes more embedded in our lives, tracking its moral compass might matter just as much as its intelligence.

TOGETHER WITH TEMPOLOR

Lazy Bits: TemPolor is your AI-powered music creation studio. It helps you generate royalty-free tracks, vocals, and video scores in minutes, so you can focus on creativity, not complex tools.

TemPolor makes music creation effortless:

  • Text-to-music: Describe the vibe or scene, and TemPolor composes a full track in seconds.

  • Playlist builder: Curate your favorite songs, organize them by mood, and share them in one click.

  • Style variations: Instantly create four distinct versions of any song with the Same But Different feature—perfect for creative flexibility.

  • Voice cloning for vocals: Add vocals without recording. Clone your own voice or choose from a library of high-quality AI voices.

  • Copyright-free: Every track is royalty-free and safe to use across platforms.

  • Video-to-music sync: Upload your video and let TemPolor automatically generate the perfect soundtrack based on mood, pacing, and visuals.

Whether you're a filmmaker, YouTuber, content creator, or marketer, TemPolor helps you move faster and sound better.

Create original, polished music without the studio →

Lazy Bits: Cursor, an AI-powered code platform, faced backlash after its AI support bot Sam fabricated a fake policy, leading to user outrage and cancellations.

In-Depth Details:

  • AI Hallucination Incident: A developer noticed unexpected logouts when switching devices. When they reached out, Cursor’s AI agent “Sam” falsely claimed it was due to a new subscription policy.

  • User Fallout: The fabricated message spread on Reddit and Hacker News, prompting multiple users to cancel subscriptions, believing Cursor had axed multi-device support.

  • Company Response: Cursor clarified the issue within hours, stating no such policy existed. The AI bot had hallucinated the claim. They refunded the affected user and publicly apologized.

  • Transparency Update: Cursor now labels all AI-generated support replies clearly, following criticism over the bot posing as a human agent without disclosure.

  • Broader Concern: The incident echoes a similar case where Air Canada’s chatbot invented a refund policy. In both cases, AI hallucinations directly harmed brand trust and customer retention.

Lazy Conclusion: AI in customer support can backfire if left unchecked. Without transparency and oversight, even a single hallucinated reply can cause real-world fallout.

Self-Improvement Strategist

You can ask ChatGPT to act as your personal development strategist, summarizing powerful lessons from self-improvement books into a concise, actionable guide.

ChatGPT will extract core insights, translate them into practical steps, and align them with your current challenges and growth goals.

Each book will be broken down in a clean, easy-to-follow format, giving you a ready-to-use playbook for building better habits, boosting productivity, and improving your mindset.

Prompt:

You are a personal development strategist with expertise in distilling high-impact lessons from self-improvement books. Your task is to create a practical, structured guide that summarizes powerful insights from selected books and translates them into real-world applications.

The goal is to provide the reader with a concise playbook they can immediately use to improve their productivity, mindset, and habits.

Key Focus Areas:

1. Core Concept Synthesis: Identify the central themes and transformative ideas from each book.
2. Practical Application: Translate big ideas into small, actionable steps that can be easily implemented.
3. Relevance to the Reader: Ensure the advice aligns with the user’s current personal growth challenges and goals.
4. Organized Delivery: Present the information in a clean, book-by-book format that’s easy to digest and revisit.

Key Information to Include:

Number of Books to Cover: [e.g., 3]
Topic Focus: [e.g., Focus, Discipline, Energy Management]
Personal Challenges: [e.g., Procrastination, Digital Distraction, Lack of Momentum]
Self-Improvement Goals: [e.g., Build consistency, Develop morning routines, Reduce screen time]

Output Requirements:

• For each book, use a sectioned format with:
    ◦ Book Title (bolded or as a header)
    ◦ Core Insights – bullet list summarizing key principles
    ◦ Actionable Takeaways – bullet list of specific habits or systems the user can apply
• Use plain, practical language that’s easy to implement
• Keep each book section self-contained and clearly labeled
• Prioritize clarity, brevity, and immediate usefulness over dense summaries
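If you reuse this prompt often, the bracketed fields can be filled in programmatically before pasting it into ChatGPT. Here is a minimal Python sketch; the template is abridged (paste the full prompt from above), and the field names simply mirror the placeholders in the template:

```python
# Minimal helper that fills the prompt template's bracketed fields.
# The template text is abridged; substitute the full prompt from above.
PROMPT_TEMPLATE = """You are a personal development strategist with expertise in \
distilling high-impact lessons from self-improvement books.

Key Information to Include:

Number of Books to Cover: {num_books}
Topic Focus: {topic_focus}
Personal Challenges: {challenges}
Self-Improvement Goals: {goals}
"""

def build_prompt(num_books, topic_focus, challenges, goals):
    """Return the prompt with the user's details substituted in."""
    return PROMPT_TEMPLATE.format(
        num_books=num_books,
        topic_focus=topic_focus,
        challenges=", ".join(challenges),
        goals=", ".join(goals),
    )

prompt = build_prompt(
    num_books=3,
    topic_focus="Focus, Discipline, Energy Management",
    challenges=["Procrastination", "Digital Distraction"],
    goals=["Build consistency", "Reduce screen time"],
)
print(prompt)
```

The example values are the ones suggested in the template; swap in your own books count, focus areas, and goals before sending.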

  • ElevenLabs now supports agent-to-agent transfer, letting AI agents route users to specialized agents based on custom rules for smoother conversations.

  • Instagram is using AI to detect teens faking their age and auto-assigning them to restricted Teen Accounts with safety features, even if they claim to be adults.

  • The Oscars now allow AI in films but favor human creativity. AI use won't disqualify movies, though human authorship may weigh more in award decisions.

  • Figma is developing an AI app builder powered by Claude Sonnet, accepting text, images, and files. It's also working on a new website tool called Figma Sites.

  • OpenAI’s new o3 and o4-mini models hallucinate more than older ones, raising concerns about accuracy despite better performance on some tasks.

  1. 🎥 ReelFarm: Automate TikToks that drive traffic to your website

  2. 📄 floatz AI: The only tool you need to find research

  3. 📚 Revisely: Experience a new way of learning with AI

  4. 📞 Assindo: AI assistant to manage your phone calls and social networks

Stop being nice to ChatGPT…lol

Did you like & enjoy today's newsletter?

Your feedback will help us improve the newsletter for you.
