I agree this is challenging, but I would also say that there are ways LLMs could help here too. Consider my previous example of the call center: an AI “call summarizer” does the work of producing documentation from the conversational inputs of the call center rep and the customer. This could be extended to aggregate and encode any interpersonal communication.
Today we want that call summarizer to produce something human-readable. Is there some future where, instead of summarizing the call, it encodes it so that it becomes a “leaf” on a “tree of knowledge”? This goes back to @joepairman’s point about explicit knowledge structures as well.
Couldn’t agree more. That documentation’s value isn’t inherent; it’s derived from creating value somewhere else (e.g., your “knowledge tree” helps you cut customer service times from an average of 8 minutes down to 4 minutes, or reduces calls altogether by supporting a bot that can answer the questions).
Finally, this takes me to a point that came up in conversation with a colleague today. “Common knowledge” and “uncommon knowledge” each have a different kind of value, and your method for extracting each would differ. You could imagine gathering “common knowledge” through simple interview bots that ask the same questions of many people (e.g., ask every software engineer in a particular function “How do you set up a CI/CD pipeline for [area]?” or “How do you deploy an application to Kubernetes?”). The shape of this common knowledge would likely converge strongly if it’s widely known and well understood. Just to continue butchering my tree metaphor, this would be a highly shaped tree in a well-manicured garden. When tougher questions get asked and the answers are more diversified (or just messier), that’s when you bring in humans to apply something like CTA to “create” an artificial tree instead of letting the answers shape it organically.
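To make that split a bit more concrete, here is a rough Python sketch of how the routing might work. Everything in it is an assumption for illustration: the function names (`convergence`, `route`), the 0.5 threshold, and the use of word-overlap (Jaccard) similarity as a stand-in for "the answers converge." A real system would more likely compare answers with embeddings or an LLM judge.

```python
from itertools import combinations


def tokens(answer):
    """Lowercase an answer and split it into a set of words."""
    return set(answer.lower().split())


def jaccard(a, b):
    """Word-overlap similarity between two token sets, from 0.0 to 1.0."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def convergence(answers):
    """Mean pairwise similarity across all answers to one question."""
    pairs = list(combinations([tokens(a) for a in answers], 2))
    if not pairs:
        return 0.0  # a single answer is no evidence of convergence
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


def route(answers, threshold=0.5):
    """Hypothetical rule: converging answers stay with the crowd-sourced
    'organic' tree; divergent ones get flagged for human-led CTA."""
    return "crowdsource" if convergence(answers) >= threshold else "cta"


# Made-up answers to "How do you deploy an application to Kubernetes?"
common = [
    "kubectl apply -f deployment.yaml",
    "kubectl apply -f deployment.yaml",
    "run kubectl apply -f deployment.yaml",
]
# Made-up answers to a tougher, messier question
messy = [
    "depends on the legacy billing system",
    "ask the platform team first",
    "we usually patch it by hand",
]

print(route(common))  # crowdsource
print(route(messy))   # cta
```

The threshold is the interesting design choice here: set it too low and messy, contested knowledge gets treated as settled; set it too high and you send well-understood topics to expensive human analysis.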
Distilling it down to a single question: is it naive and overly simplistic to think you can build a knowledge corpus with a combination of crowdsourcing and more directed CTA?