[Image: /images/blogs/contextful.png]
The single biggest drain on your technical operations isn't the complexity of your systems; it's missing context. When your support and engineering teams don't have the full story of what a customer experienced, they waste hours trying to reproduce problems, leading to slow resolutions, frustrated customers, and costly escalations. This context gap also blinds your AI initiatives—an AI cannot analyze or automate what it cannot see.
The reason your teams are flying blind is a fundamentally broken economic model in traditional logging tools. These platforms impose a "100x Indexing Tax," charging an exorbitant premium to make data searchable—often over 100 times the cost of simply storing it¹. This punitive cost forces your teams into a dangerous compromise: they must discard 90-99% of your operational data through a practice called "sampling" just to control the budget³. Every piece of data they discard is a piece of missing context, creating the blind spots that cost your business time, money, and customer trust.
Softprobe eliminates the context gap by fixing the broken economic model. Instead of treating every log line as a separate, disconnected piece of information, we capture the entire user session as a single, complete record. We call this the "Session Graph"—a full-context, AI-ready map of every action, request, and response in a user's journey. This architecture eliminates the need for the expensive indexing tax, making it affordable to capture 100% of your data.
This architectural shift delivers three transformative business outcomes:
- Our Session Graphs provide the rich, structured context that AI needs to automate root cause analysis, predict issues before they impact customers, and prevent costly escalations to your most expensive engineering talent.
- By eliminating the indexing tax, we enable you to capture 100% of your data for full context, while reducing overall observability spend by over 60%¹⁰.
- We end the dangerous practice of data sampling. With full-fidelity data, your teams have the complete story for every incident, closing the visibility gaps that lead to prolonged downtime and missed security threats.
Start capturing the whole story for every customer. Context-based logging turns your largest data cost into your most valuable strategic asset, empowering both your people and your AI to operate with full visibility.
To understand the need for a new paradigm, it is essential to first deconstruct the cost structure of incumbent observability platforms. These platforms, while powerful, have evolved pricing models that are complex and often punitive at scale. Using Datadog as a representative example of this index-centric model, this section will reveal the economic drivers that compel organizations toward suboptimal data practices.
Modern observability platforms are not monolithic products but rather suites of interconnected services, each with a distinct pricing metric. This multi-vector approach makes cost prediction and control exceptionally challenging for enterprises, as expenses scale along several independent axes.
The assertion that indexing is disproportionately expensive compared to raw storage can be validated through a direct cost comparison based on public pricing data. This analysis reveals the fundamental economic imbalance of the traditional model.
For this calculation, we assume a typical log message size of 500 bytes, or roughly two million log events per gigabyte. This is a conservative figure, given that many structured logs are even smaller and therefore produce more billable events per gigabyte.
The ratio between these two costs is stark. The cost to ingest and index 1 GB of logs in a traditional platform ($5.10) is approximately 221 times more expensive than the cost to store that same gigabyte in S3 Standard ($0.023)⁸. This quantitative analysis validates the claim that indexing costs are orders of magnitude higher than storage costs.
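The arithmetic is easy to reproduce. The sketch below reruns it in Python; the per-GB ingestion fee and per-million-event indexing fee are assumed representative unit prices chosen to be consistent with the $5.10 figure above, not quotes from any specific vendor contract.

```python
# Rough cost comparison: indexing 1 GB of logs vs. storing that gigabyte in S3 Standard.
# Unit prices below are illustrative assumptions consistent with the figures in the text.

GB = 1_000_000_000            # bytes in a gigabyte
AVG_LOG_SIZE = 500            # bytes per log event (stated assumption)
INGEST_PER_GB = 0.10          # $ per GB ingested (assumed list price)
INDEX_PER_MILLION = 2.50      # $ per million events indexed, 30-day hot retention (assumed)
S3_STANDARD_PER_GB = 0.023    # $ per GB-month, S3 Standard⁸

events_per_gb = GB / AVG_LOG_SIZE                                   # 2,000,000 events
index_cost = INGEST_PER_GB + (events_per_gb / 1e6) * INDEX_PER_MILLION
storage_cost = S3_STANDARD_PER_GB

print(f"Ingest + index 1 GB: ${index_cost:.2f}")                    # $5.10
print(f"Store 1 GB in S3:    ${storage_cost:.3f}")                  # $0.023
print(f"Ratio: {index_cost / storage_cost:.1f}x")                   # ~221.7x, the ~221x cited above
```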
This punitive economic model is the direct cause of widespread engineering practices that are antithetical to the goals of true observability. Faced with unpredictable and escalating bills, engineering and finance departments are forced to treat observability data not as a valuable asset but as a toxic liability to be minimized.
This leads to a process of forced data rationing. Teams are required to make difficult, often arbitrary, decisions about what data to discard. This is typically done through aggressive filtering and sampling strategies, where only a small fraction of logs and traces are ultimately retained for analysis. Observability ceases to be a technical practice focused on system understanding and becomes a budgetary exercise in cost avoidance.
Observability vendors, aware of this customer pain point, have introduced features marketed as solutions. Datadog's "Logging without Limits™" and "Flex Logs" are prime examples³. While offering more granular control, these features are fundamentally complex cost-containment workarounds. They place the burden on the customer to perform significant, ongoing engineering work to define and manage intricate filtering rules, create multiple data tiers, and decide which data is "valuable" enough to index versus which should be relegated to less accessible archives²². This is undifferentiated heavy lifting forced upon the customer by the vendor's pricing model. The vendor profits from both the problem (high indexing costs) and the complex "solution" required to mitigate it.
The ultimate consequence of this entire cycle is the creation of pervasive visibility gaps. When a novel, intermittent, or rare "black swan" event occurs, the specific logs or traces needed to diagnose the issue have often been preemptively discarded in the name of cost savings. This leads to prolonged mean time to resolution (MTTR)⁴, frustrated engineers, and a compromised ability to understand and improve system resilience³.
The economic and technical limitations of the index-centric model necessitate a fundamental architectural rethink. Context-based logging represents such a shift, moving away from the brute-force indexing of disconnected data points toward a more intelligent approach centered on holistic, interconnected events. This section defines the core principles of this new architecture and explains how it breaks the punitive economic model of its predecessors.
The first and most crucial conceptual shift is the redefinition of the atomic unit of observability. In traditional systems, the atomic unit is the individual log line or the single span within a trace. These are treated as independent entities that must be painstakingly correlated after the fact using shared identifiers like a trace_id or request_id. This approach places the burden of reassembling context on the engineer or the query engine at the time of an investigation.
Context-based logging inverts this model. It posits that the true atomic unit of work in any system is the complete "session"—a term used here to describe the entire sequence of events, from start to finish, that corresponds to a single logical operation. This could be a user's web request, an API call, a batch processing job, or any other defined unit of work.
To represent these sessions, the context-based model employs a powerful data structure: the graph. A "SessionJSON" document is best understood as a serialized representation of a per-session knowledge graph.
This architectural choice represents a move from a "schema-on-write" model, where data is forced into a rigid indexed structure upon ingestion, to a more flexible "relationship-on-write" model. By preserving the intrinsic relationships between events in a graph, the system maintains data agility. New and unforeseen questions can be answered by applying new graph algorithms or traversal patterns to the existing data without requiring a costly re-indexing of the entire historical dataset. This approach not only reduces operational complexity but also future-proofs the observability investment by enabling continuous analytical evolution.
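To make the idea concrete, the sketch below assembles a minimal session document of this kind in Python. The schema is hypothetical; field names such as session_id, nodes, and edges are illustrative choices, and the only property the model actually requires is that every event and its relationships travel together in one record.

```python
import json

# A minimal, hypothetical "SessionJSON" document: one record per logical session,
# with events as graph nodes and their causal relationships as edges.
session = {
    "session_id": "sess-7f3a",
    "root": "evt-1",
    "nodes": [
        {"id": "evt-1", "kind": "http_request", "service": "web",       "status": 200, "ts": "2025-10-22T10:00:00Z"},
        {"id": "evt-2", "kind": "rpc_call",     "service": "checkout",  "status": 200, "ts": "2025-10-22T10:00:00.120Z"},
        {"id": "evt-3", "kind": "sql_query",    "service": "orders-db", "status": 500, "ts": "2025-10-22T10:00:00.310Z"},
    ],
    "edges": [
        {"from": "evt-1", "to": "evt-2", "relation": "calls"},
        {"from": "evt-2", "to": "evt-3", "relation": "queries"},
    ],
}

# The whole session is serialized and stored as a single object -- no per-line indexing.
payload = json.dumps(session)
print(len(payload), "bytes for the complete session")
```

Because the causal edges are written alongside the events, no trace_id join is needed later to reconstruct what happened.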
The use of the session graph is the architectural lynchpin that makes it possible to break the economic model of traditional logging. It enables a complete decoupling of data storage from the act of querying, shifting the primary cost away from indexing and onto cheap, scalable cloud storage.
This architectural separation of storage and compute is a hallmark of modern, cloud-native data platforms. Observe Inc., for example, has built its observability platform on this very principle, using low-cost object storage such as Amazon S3 for the data lake and a separate, on-demand compute layer (Snowflake) for querying. This real-world example shows that decoupling storage from compute is a viable and powerful industry trend, offering a path to escape the punitive costs of the tightly coupled, index-centric model.
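A minimal sketch of that decoupling, assuming the serialized session document from the previous example, the boto3 client, and a hypothetical bucket named session-graphs: complete sessions are written once as cheap compressed objects, and compute is spent only when a question is actually asked.

```python
import gzip
import json
import boto3

s3 = boto3.client("s3")
BUCKET = "session-graphs"   # hypothetical bucket name

def store_session(session: dict) -> str:
    """Write one complete session graph as a compressed S3 object (storage-only cost)."""
    key = f"sessions/{session['session_id']}.json.gz"
    s3.put_object(Bucket=BUCKET, Key=key, Body=gzip.compress(json.dumps(session).encode()))
    return key

def load_session(session_id: str) -> dict:
    """Pull a single session back on demand; compute is spent only at query time."""
    obj = s3.get_object(Bucket=BUCKET, Key=f"sessions/{session_id}.json.gz")
    return json.loads(gzip.decompress(obj["Body"].read()))
```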
The architectural shift to context-based logging is not merely a cost-optimization strategy; it unlocks a new tier of analytical capabilities. By structuring data as a graph, it provides a format that is not just "AI-friendly" but is the native language of many advanced AI and ML systems. This section explores the technical advantages of this approach, from enhancing AI model accuracy to fundamentally transforming the daily workflow of engineers.
Complex systems—be they social networks, biological pathways, or distributed software applications—are inherently graphs of interconnected entities. AI and ML models achieve higher accuracy and deeper understanding when the data they consume reflects these real-world relationships, something that flat, tabular representations struggle to capture.
This approach fundamentally reduces the need for manual feature engineering, one of the most time-consuming and error-prone stages of applying ML to observability data. In a traditional model, data scientists must expend significant effort to parse logs, extract features, and attempt to infer relationships. In a context-based model, the graph structure itself is the feature engineering. The relationships are explicitly encoded, ready for consumption by graph-aware algorithms, thereby accelerating the development lifecycle and democratizing access to advanced analytics.
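As a concrete illustration of "the graph structure itself is the feature engineering," the sketch below derives model-ready features directly from the stored relationships, assuming the hypothetical SessionJSON layout from earlier and the networkx library (the source prescribes neither).

```python
import networkx as nx

def session_to_graph(session: dict) -> nx.DiGraph:
    """Rebuild the session's call graph straight from the stored nodes and edges."""
    g = nx.DiGraph()
    for node in session["nodes"]:
        g.add_node(node["id"], **node)
    for edge in session["edges"]:
        g.add_edge(edge["from"], edge["to"], relation=edge["relation"])
    return g

def structural_features(session: dict) -> dict:
    """Features that come 'for free' because relationships were preserved at write time."""
    g = session_to_graph(session)
    return {
        "num_events": g.number_of_nodes(),
        "num_calls": g.number_of_edges(),
        "call_depth": nx.dag_longest_path_length(g),            # depth of the request chain
        "fan_out": max(dict(g.out_degree()).values() or [0]),   # widest downstream fan-out
        "error_count": sum(1 for _, d in g.nodes(data=True) if d.get("status", 200) >= 500),
    }
```

No regex parsing or post-hoc correlation is needed to produce these features; they are read directly off the structure the session already carries.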
Graph Neural Networks (GNNs) represent a frontier of machine learning designed specifically to operate on graph-structured data. By applying GNNs to session graphs, organizations can move from reactive monitoring to proactive and predictive analysis.
The application of GNNs to observability is not a purely academic exercise. Major industry players like Splunk are actively leveraging graph analytics and GNNs to enhance their security and observability offerings, recognizing their power to uncover hidden patterns in complex, interconnected data.
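For illustration only, the sketch below scores a single session graph with a tiny two-layer graph convolutional network. PyTorch Geometric and the node features used here are assumptions; the source names no specific framework or feature set, and an untrained model like this would need to be fitted on historical session graphs before its scores mean anything.

```python
import torch
import torch.nn.functional as F
from torch_geometric.data import Data
from torch_geometric.nn import GCNConv, global_mean_pool

class SessionScorer(torch.nn.Module):
    """Tiny two-layer GCN that maps a whole session graph to an anomaly score."""
    def __init__(self, num_features: int, hidden: int = 32):
        super().__init__()
        self.conv1 = GCNConv(num_features, hidden)
        self.conv2 = GCNConv(hidden, hidden)
        self.head = torch.nn.Linear(hidden, 1)

    def forward(self, data: Data) -> torch.Tensor:
        x = F.relu(self.conv1(data.x, data.edge_index))
        x = F.relu(self.conv2(x, data.edge_index))
        pooled = global_mean_pool(x, torch.zeros(x.size(0), dtype=torch.long))  # one graph per batch
        return torch.sigmoid(self.head(pooled))

# Hypothetical 3-event session; node features here are [status_class, latency_ms, is_db_call].
x = torch.tensor([[0.0, 45.0, 0.0], [0.0, 120.0, 0.0], [1.0, 900.0, 1.0]])
edge_index = torch.tensor([[0, 1], [1, 2]], dtype=torch.long)  # evt-1 -> evt-2 -> evt-3
score = SessionScorer(num_features=3)(Data(x=x, edge_index=edge_index))
print(f"anomaly score: {score.item():.3f}")   # untrained, so the value is meaningless here
```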
The architectural differences between the two models have a profound impact on the day-to-day experience of the engineers responsible for maintaining system reliability. The debugging workflow is fundamentally transformed from a fragmented search process into a focused exploration.
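The contrast shows up directly in code. Rather than grepping for a trace_id across several indexes, an engineer can ask one question of one object, for example "what is the path from the entry point to the first failing event?". The sketch below, which again assumes the hypothetical SessionJSON layout, answers it with a single graph traversal.

```python
import networkx as nx

def path_to_first_failure(session: dict) -> list:
    """Walk one session graph from its entry point to the first event with an error status."""
    g = nx.DiGraph()
    for node in session["nodes"]:
        g.add_node(node["id"], **node)
    for edge in session["edges"]:
        g.add_edge(edge["from"], edge["to"])
    failures = [n for n, d in g.nodes(data=True) if d.get("status", 200) >= 500]
    if not failures:
        return []
    # The whole causal chain is already in one record; no cross-index correlation step.
    return nx.shortest_path(g, source=session["root"], target=failures[0])

# For the example session above this returns ['evt-1', 'evt-2', 'evt-3']:
# web request -> checkout call -> failing SQL query.
```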
The most compelling argument for any new technology architecture, particularly in the cost-sensitive domain of observability, is a quantitative analysis of its economic impact. This section presents a data-driven Total Cost of Ownership (TCO) model comparing the traditional, index-centric approach with the context-based model under a realistic, large-scale operational scenario.
To ensure a transparent and credible comparison, both models are evaluated against the same underlying assumptions: a hypothetical large-scale enterprise application with significant data generation and a 30-day hot retention requirement.
This model calculates the projected monthly costs based on the Datadog pricing structure. It assumes an organization attempts to retain comprehensive visibility by indexing all log data for the 30-day hot period.
In this model, the log indexing cost of $150,000 constitutes over 90% of the total monthly bill, confirming its status as the dominant cost driver.
This model assumes the primary costs are for commodity cloud storage and the on-demand compute required for graph traversal and analysis.
Presenting these two models side-by-side reveals the profound economic implications of the architectural shift. While the traditional model costs an estimated $164,827 per month, the context-based model is estimated at $62,517 per month, representing a potential cost reduction of over 60%.
However, the most critical difference lies in how these costs scale. In the traditional model, the dominant cost factor is indexing ($150,000). If the data volume doubles, this cost component will also double, driving the total bill to over $300,000 per month. In the context-based model, the dominant cost is query compute ($50,000), which is usage-dependent and likely to grow sub-linearly with data volume. The storage cost, the component that grows linearly with data, is negligible by comparison. This means the context-based model possesses vastly superior economic scalability, allowing organizations to absorb data growth without facing exponential cost increases.
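The scaling argument can be reproduced with a few lines of arithmetic. In the sketch below, the split of the "other" cost components is an assumption chosen only so the totals reconcile with the figures above; what matters is how each total moves when data volume doubles.

```python
# Monthly TCO figures from the comparison above; the split of the "other" components
# is an assumption made only so this sketch reconciles with the stated totals.

# Traditional, index-centric model
index_cost = 150_000            # dominant driver, scales linearly with data volume
other_traditional = 14_827      # ingestion, infrastructure monitoring, etc. (assumed split)
traditional_total = index_cost + other_traditional            # $164,827

# Context-based model
query_compute = 50_000          # dominant driver, scales with usage, held flat here
storage_and_other = 12_517      # object storage and pipeline overhead (assumed split)
context_total = query_compute + storage_and_other             # $62,517

savings = 1 - context_total / traditional_total
print(f"baseline: ${traditional_total:,} vs ${context_total:,} ({savings:.0%} lower)")

# Double the data volume: indexing and storage scale linearly, query compute does not.
traditional_2x = 2 * index_cost + other_traditional           # > $300,000
context_2x = query_compute + 2 * storage_and_other            # ~ $75,000
print(f"at 2x data: ${traditional_2x:,} vs ${context_2x:,}")
```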
The economic feasibility of the context-based model enables a strategic shift that goes beyond cost savings: the move from partial, sampled data to complete, full-fidelity observability. Capturing 100% of system data eliminates the inherent risks and technical debt associated with sampling, transforming the nature of debugging and future-proofing an organization's data assets.
In response to the high costs of the index-centric model, the industry has widely adopted data sampling as a necessary evil. This practice is most prevalent in distributed tracing, where two primary methodologies are used: head-based sampling, in which the keep-or-discard decision is made when a trace begins, and tail-based sampling, in which the decision is deferred until the trace has completed and its outcome is known.
Both methods, however, share a fundamental flaw: they are probabilistic. They operate on the assumption that the small fraction of data retained will be sufficiently representative of the whole. For troubleshooting rare, intermittent, or novel failure modes, this assumption frequently breaks down, leaving engineers without the data they need at the most critical moments.
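A short simulation makes the probabilistic flaw concrete. Assuming a hypothetical 1% head-based sampling rate and a failure that affects 0.1% of requests, the sketch below estimates how many failing traces actually survive sampling; the numbers are illustrative, not drawn from the source.

```python
import random

random.seed(7)

SAMPLE_RATE = 0.01      # hypothetical 1% head-based sampling
FAILURE_RATE = 0.001    # rare failure affecting 0.1% of requests
REQUESTS = 1_000_000

kept_failures = 0
total_failures = 0
for _ in range(REQUESTS):
    failed = random.random() < FAILURE_RATE
    sampled = random.random() < SAMPLE_RATE   # head-based: decided before the outcome is known
    total_failures += failed
    kept_failures += failed and sampled

print(f"{total_failures} failing requests, {kept_failures} retained for analysis")
# Expected retention is only FAILURE_RATE * SAMPLE_RATE * REQUESTS, roughly 10 traces,
# and every discarded one is a trace an engineer may need during an incident.
```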
Operating with a sampled dataset introduces significant risks and limitations that undermine the goals of observability.
The ability to cost-effectively capture 100% of observability data is more than an incremental improvement; it is a strategic enabler for the next generation of IT operations and analytics.
The landscape of observability is at a critical inflection point. The architectural model that has dominated the last decade—centered on the brute-force indexing of log lines—has reached its economic breaking point. The "indexing tax" has created an unsustainable financial model that forces engineering organizations into a self-defeating compromise: discarding the very data they need to ensure system reliability in order to control costs.
Context-based logging presents a viable and compelling path forward. By fundamentally re-architecting the observability pipeline around the holistic session graph and decoupling expensive query compute from inexpensive cloud storage, this new paradigm resolves the central economic conflict of the legacy model. It makes the capture of 100% of observability data not just technically possible, but financially rational.
The implications of this shift are profound: full-fidelity data becomes the default rather than the exception, observability spend scales with the questions asked rather than with the volume of data generated, and the resulting session graphs give both engineers and AI systems a complete, structured picture of every incident.
The transition from index-centric logging to context-aware analysis is not merely an incremental improvement but a necessary architectural evolution. For organizations seeking to build resilient, understandable, and economically sustainable systems in the cloud-native era, embracing this paradigm shift will be a critical strategic advantage.
1. First call/contact resolution (FCR) - Medallia, accessed October 22, 2025, https://www.medallia.com/experience-101/glossary/first-callcontact-resolution/
2. AWS S3 Cost Calculator 2025: Hidden Fees Most Companies Miss - CostQ Blog, accessed October 22, 2025, https://costq.ai/blog/aws-s3-cost-calculator-2025/
3. Why slow ticket resolution is detrimental to company's overall performance - Shortways, accessed October 22, 2025, https://shortways.com/blog/smartticketing/ticket-slowness-detrimental-performance/
4. MTBF, MTTR, MTTF, MTTA: Understanding incident metrics - Atlassian, accessed October 22, 2025, https://www.atlassian.com/incident-management/kpis/common-metrics
5. 7 Key Takeaways From IBM's Cost of a Data Breach Report 2024 ..., accessed October 22, 2025, https://www.zscaler.com/blogs/product-insights/7-key-takeaways-ibm-s-cost-data-breach-report-2024
6. How to Reduce Ticket Response Time in 2025 - ProProfs Help Desk, accessed October 22, 2025, https://www.proprofsdesk.com/blog/ticket-response-time/
7. Log Sampling - What is it, Benefits, When To Use it, Challenges, and Best Practices - Edge Delta, accessed October 22, 2025, https://edgedelta.com/company/blog/what-is-log-sampling
8. S3 Pricing - AWS, accessed October 22, 2025, https://aws.amazon.com/s3/pricing/
9. Customer Journey Analytics overview - Experience League, accessed October 22, 2025, https://experienceleague.adobe.com/en/docs/analytics-platform/using/cja-overview/cja-b2c-overview/cja-overview
10. The Economics of Observability - Observe Inc., accessed October 22, 2025, https://www.observeinc.com/blog/the-economics-of-observability
11. Datadog pricing explained with real-world scenarios - Coralogix, accessed October 22, 2025, https://coralogix.com/blog/datadog-pricing-explained-with-real-world-scenarios/
12. 15 Essential Help Desk Metrics & KPIs [+ Best Practices] - Tidio, accessed October 22, 2025, https://www.tidio.com/blog/helpdesk-metrics/
13. Strategic Business Value of IT Help Desk Support - Netfor, accessed October 22, 2025, https://www.netfor.com/2025/04/02/it-help-desk-support-2/
14. How do you balance urgent support tickets with long-term IT projects? - Reddit, accessed October 22, 2025, https://www.reddit.com/r/ITManagers/comments/1mnbbcc/how_do_you_balance_urgent_support_tickets_with/
15. Ticket Handling: Best Practices for Better Support - Help Scout, accessed October 22, 2025, https://www.helpscout.com/blog/ticket-handling-best-practices/
16. How to Reduce Logging Costs with Log Sampling - Better Stack Community, accessed October 22, 2025, https://betterstack.com/community/guides/logging/log-sampling/
17. Graph-enhanced AI & Machine Learning - InterProbe Information Technologies, Medium, accessed October 22, 2025, https://medium.com/@interprobeit/graph-enhanced-ai-machine-learning-555ca5119b80
18. Azure Blob Storage pricing - Microsoft Azure, accessed October 22, 2025, https://azure.microsoft.com/en-us/pricing/details/storage/blobs/
19. Amazon S3 Glacier API Pricing - Amazon Web Services, accessed October 22, 2025, https://aws.amazon.com/s3/glacier/pricing/
20. What Is A Good CSAT Score? - SurveyMonkey, accessed October 22, 2025, https://www.surveymonkey.com/mp/what-is-good-csat-score/
21. Manage Logging Costs Without Losing Insight - Logz.io, accessed October 22, 2025, https://logz.io/blog/logging-cost-management-observability/
22. How much does SEIM logging and storage cost for your company? - r/sysadmin, Reddit, accessed October 22, 2025, https://www.reddit.com/r/sysadmin/comments/pb6vgj/how_much_seim_logging_and_storage_cost_for/
23. What is observability? Not just logs, metrics, and traces - Dynatrace, accessed October 22, 2025, https://www.dynatrace.com/news/blog/what-is-observability-2/
24. The Power of Graph Technology in AI Landscape - Mastech InfoTrellis, accessed October 22, 2025, https://mastechinfotrellis.com/blogs/the-power-of-graph-technology-in-ai-landscape
25. Technical distributed tracing details - New Relic Documentation, accessed October 22, 2025, https://docs.newrelic.com/docs/distributed-tracing/concepts/how-new-relic-distributed-tracing-works/
26. 2024 IBM Breach Report: More breaches, higher costs - Barracuda Networks Blog, accessed October 22, 2025, https://blog.barracuda.com/2024/08/20/2024-IBM-breach-report-more-breaches-higher-costs
27. An example of reconstruction and anomaly scores produced by... - ResearchGate, accessed October 22, 2025, https://www.researchgate.net/figure/An-example-of-reconstruction-and-anomaly-scores-produced-by-autoencoders-trained-with_fig1_360625609
28. The True Cost of Customer Support: 2025 Analysis Across 50 ..., accessed October 22, 2025, https://livechatai.com/blog/customer-support-cost-benchmarks
29. Why integrating AI with graph-based technology is the future of cloud security, accessed October 22, 2025, https://outshift.cisco.com/blog/integrating-ai-graph-technology
30. Does Negative Sampling Matter? A Review with Insights into its Theory and Applications, accessed October 22, 2025, https://arxiv.org/html/2402.17238v1
31. CSAT Scores: How to Measure and Improve the Customer Service Experience - Medallia, accessed October 22, 2025, https://www.medallia.com/blog/csat-how-to-measure-and-improve-the-customer-service-experience/
32. Time Series Anomaly Detection using Prediction-Reconstruction Mixture Errors - MIT DSpace, accessed October 22, 2025, https://dspace.mit.edu/handle/1721.1/144671
33. Cloud Storage Pricing - Updated for 2025 - Finout, accessed October 22, 2025, https://www.finout.io/blog/cloud-storage-pricing-comparison
34. Transaction sampling - Elastic Docs, accessed October 22, 2025, https://www.elastic.co/docs/solutions/observability/apm/transaction-sampling
35. Challenges in implementing application observability - Fastly, accessed October 22, 2025, https://www.fastly.com/learning/cdn/challenges-in-implementing-application-observability