Tuomas Piippo
CTO, Founding Partner
Rework
If I had to name a single thing that has been most surprising in the rise of AI in the last few years, it would be how badly the Turing test ended up aging.
For a long time, it was treated as the cleanest imaginable threshold between the mechanical things computers do and real human intelligence. If a machine could converse so naturally that we could not tell whether it was human, surely something important had been reached. And surely, at some point, there would have been big headlines about the test having finally been passed.
But I don’t remember ever seeing those headlines, and now it’s probably too late for them. We seem to have moved past that point almost without noticing. More often than not, the AI would now have to be made less capable, less articulate, or at least a much lazier typist to pass for a human.
That is probably what happens when the AI breakthrough arrives through language models. They are built from patterns in human expression, so the first thing they did was to become remarkably good at reproducing the surface of human conversation. The Turing test was looking for imitation, and suddenly imitation became the easy part.
This matters for AI use in fields where the answers are supposed to carry real consequences. Management consulting, for one, is almost defined by high-stakes ambiguity: important decisions, incomplete information, conflicting evidence, and people who will have to live with the outcome.
In that setting, imitation is not enough. A system that merely sounds like a consultant is a little like three children in a long coat trying to pass as an adult. From a distance, the silhouette may be convincing. The suit is there. The posture is there. Maybe the vocabulary is there too: operating model, value levers, strategic priorities, decision velocity.
But if you were deciding how to reorganize a business unit, where to cut cost, or whether to bet months of work on a new operating model, the suit and the vocabulary would not get very far. You would want to know whether there is actually a responsible point of view underneath it. Whether someone with the experience and capability to grasp the problem has actually done the thinking.
This is where LLM-generated consulting output falls short as advice. It can reproduce the surface of the profession (the tone, the vocabulary, the structure of a memo), but the style is only useful when there is a grounded read underneath it.
Put another way: would you be happy paying a consultant who quietly pasted your problem into ChatGPT, copied the answer into a slide deck, and presented it as their own judgment?
Probably not. Not because ChatGPT is useless, but because that is not what you thought you were buying. You were paying for someone to understand the situation, weigh the evidence, take responsibility for a read, and tell you what they can actually stand behind.
That missing layer is what I mean by a point of view. The consultant is not paid merely to produce the final words on the slide. They are paid to stand somewhere in relation to the problem: to know what they have seen, what they believe follows from it, what remains uncertain, and what they would refuse to claim. If AI is to do more than draft the slide, the same standard applies.
In practice, that means the system does not treat every question as an invitation to produce an answer. It must first ask what the engagement so far actually supports.
Ask it for the payback on an initiative. If it has only cost-side observations, with no benefit measure and no time horizon, it should not invent a fluent payback story. Cost data does not become payback data just because the question was asked in payback terms. A system with a point of view says: I cannot stand behind that claim yet. The strongest adjacent claim I can support is the cost denominator, and here is what would close the gap.
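To make the payback example concrete, here is a minimal sketch of that kind of gate in Python. It is an illustration only, not a description of how Rethink is implemented: the names `REQUIRED_FOR_PAYBACK`, `ClaimCheck`, and `check_payback_claim` are hypothetical, and so is the assumption that evidence can be labeled with simple kinds such as "cost" or "benefit".

```python
from dataclasses import dataclass
from typing import List, Optional, Set

# Hypothetical: the kinds of evidence a payback claim needs before it can be made.
REQUIRED_FOR_PAYBACK = ("cost", "benefit", "time_horizon")


@dataclass
class ClaimCheck:
    supported: bool                          # can the system stand behind the claim?
    missing: List[str]                       # inputs that would close the gap
    strongest_adjacent_claim: Optional[str]  # what it can support instead, if anything


def check_payback_claim(evidence_kinds: Set[str]) -> ClaimCheck:
    """Decide whether a payback claim is supported by the evidence gathered so far."""
    missing = [kind for kind in REQUIRED_FOR_PAYBACK if kind not in evidence_kinds]
    if not missing:
        return ClaimCheck(supported=True, missing=[], strongest_adjacent_claim=None)
    # Cost data alone supports only the cost side of the calculation.
    adjacent = "cost denominator" if "cost" in evidence_kinds else None
    return ClaimCheck(supported=False, missing=missing, strongest_adjacent_claim=adjacent)


# Only cost-side observations exist so far.
result = check_payback_claim({"cost"})
print(result)
```

The point of the sketch is the order of operations: the check runs before any answer is drafted, and a refusal comes with the missing inputs attached rather than as a bare "no".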
Or take disagreement. Suppose the interviews say the review process is blocked because one team is overloaded, but the workflow data shows delays scattered across several handoffs. A fluency-first system may blend those into a vague summary: “capacity and coordination issues are slowing delivery.” A system with a point of view should keep the signals separate. It should say: the interview evidence points to team overload; the process evidence points to handoff fragmentation; these are not the same diagnosis, and I would not collapse them into one recommendation yet.
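One way to picture "keeping the signals separate" is a summary step that reports each source on its own line and flags disagreement instead of blending it away. The Python below is a rough sketch under that assumption; `Finding` and `summarize` are hypothetical names, and a real system would need a far richer notion of when two diagnoses actually differ than plain string comparison.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Finding:
    source: str     # where the signal came from, e.g. "interview evidence"
    diagnosis: str  # what that source points to


def summarize(findings: List[Finding]) -> List[str]:
    """Report each source separately and flag conflicting diagnoses."""
    lines = [f"{f.source} points to {f.diagnosis}" for f in findings]
    distinct = {f.diagnosis for f in findings}
    if len(distinct) > 1:
        # Do not collapse different diagnoses into one recommendation.
        lines.append("these are not the same diagnosis; no single recommendation yet")
    return lines


report = summarize([
    Finding("interview evidence", "team overload"),
    Finding("process evidence", "handoff fragmentation"),
])
for line in report:
    print(line)
```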
Or take decisions. If leadership accepts a recommendation, that decision should matter in future work. But it should not magically prove that the original diagnosis was correct. The decision becomes a constraint, while the evidence trail stays open.
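The separation between a decision and its evidence can be sketched as two fields that are updated by different events. This is an assumed, simplified data model for illustration; `Diagnosis`, `Engagement`, and `accept_decision` are hypothetical names, not part of any real product API.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Diagnosis:
    statement: str
    evidence: List[str] = field(default_factory=list)
    evidence_status: str = "open"  # changes only when the evidence itself changes


@dataclass
class Engagement:
    diagnosis: Diagnosis
    constraints: List[str] = field(default_factory=list)

    def accept_decision(self, decision: str) -> None:
        """Record a leadership decision as a constraint on future work.

        Deliberately leaves diagnosis.evidence_status untouched: accepting
        a recommendation does not prove the diagnosis behind it.
        """
        self.constraints.append(decision)


engagement = Engagement(Diagnosis("review delays stem from handoff fragmentation"))
engagement.accept_decision("consolidate review handoffs")
print(engagement.constraints, engagement.diagnosis.evidence_status)
```

Later work can then read `constraints` to respect what was decided, while anything that questions the diagnosis still has an open evidence trail to update.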
That is the difference. A fluent system tries to answer the prompt. A consulting system with a point of view knows what it can stand behind, what it cannot yet claim, and what has changed because of earlier work.
At Rework, we are building Rethink to support the kind of consulting work where fluent answers are not enough. The goal is not to create another system that can produce polished business language on demand. The goal is to build an artificial consultant that can participate in the work with a grounded, inspectable point of view.
If Rethink says a claim is supported, the evidence should be inspectable. If it refuses a recommendation, the missing input should be clear. If its read changes, the reason should be traceable. If leadership accepts a decision, Rethink should remember that as part of the engagement without pretending the evidence question is closed.
That is the kind of system we would be comfortable attributing work to: one whose work can be challenged, corrected, and carried forward without pretending the system is human.
This is why we have started using the phrase subjective AI at Rework. The wording is a little risky, because subjective often means biased or merely personal. That is not what we are after.
We mean something more specific: AI that has a position inside the work. It has seen some things, missed others, made some commitments, and changed its read over time. Its judgment is not floating above the engagement. It is part of the engagement.
In fact, one of the ways I will know Rethink is working is that it will not always answer me straight away. Sometimes it should pause. Sometimes it should ask what the engagement actually supports. Sometimes it should decline to draw the process diagram I asked for, because the evidence is not there yet.
That is what we are building Rethink toward.