Have you tested any AI-enabled cybersecurity solutions that promised significant improvements but failed to deliver measurable outcomes? What were the key lessons you learned from that experience?
In our organization, we have focused on internal solutions rather than external ones. We are leveraging Azure OpenAI, AWS Bedrock, and Snowflake Cortex as our LLM platforms. The primary use cases for these models are search, summarization, and content evaluation within cybersecurity. Our experimental efforts are centered on cyber threat intelligence (CTI), aiming to collect data from various open and internal sources and map it to our asset landscape. This helps us identify gaps and understand which assets are most vulnerable. Threat intelligence is often abundant, but without proper mapping to our ecosystem it is difficult to act on effectively. We expect to have a solution in the testing phase by the end of this year, with plans to roll it out on a broader scale and integrate it with tools like Snowflake and ServiceNow to streamline these processes.
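To make that mapping step concrete, here is a minimal sketch (not the respondent's production design) of joining collected CTI records against an asset inventory to surface the most exposed assets. The data structures, field names, and severity ordering are illustrative assumptions; in practice the inputs would come from the CTI feeds and an inventory source such as ServiceNow, with the LLM layer sitting on top for search and summarization.

```python
# Illustrative sketch: map CTI advisories onto an asset inventory to find exposure gaps.
# All classes and field names here are hypothetical, not any vendor's schema.
from dataclasses import dataclass


@dataclass
class ThreatRecord:
    cve_id: str
    affected_product: str      # e.g. "openssl 1.1.1"
    severity: str              # "critical" / "high" / "medium" / "low"


@dataclass
class Asset:
    name: str
    installed_products: list[str]
    business_criticality: str  # "high" / "medium" / "low"


def map_threats_to_assets(threats: list[ThreatRecord], assets: list[Asset]) -> list[dict]:
    """Pair each advisory with every asset running an affected product."""
    order = {"critical": 0, "high": 1, "medium": 2, "low": 3}
    exposures = []
    for threat in threats:
        for asset in assets:
            if threat.affected_product in asset.installed_products:
                exposures.append({
                    "asset": asset.name,
                    "cve": threat.cve_id,
                    "severity": threat.severity,
                    "criticality": asset.business_criticality,
                })
    # Surface the riskiest combinations first: critical assets hit by critical advisories.
    exposures.sort(key=lambda e: (order.get(e["criticality"], 4),
                                  order.get(e["severity"], 4)))
    return exposures


if __name__ == "__main__":
    threats = [ThreatRecord("CVE-2024-0001", "openssl 1.1.1", "critical")]
    assets = [Asset("payments-api", ["openssl 1.1.1", "nginx 1.24"], "high")]
    for exposure in map_threats_to_assets(threats, assets):
        print(exposure)
```

A real pipeline would match on normalized product identifiers (for example CPEs) rather than exact strings, but the basic join-and-prioritize pattern is what turns raw intelligence into something the SOC can act on.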
At Thoughtworks, we use the Google SecOps platform for our SIEM, SOAR, and threat intelligence functions. Currently, we are experimenting with the AI agent capabilities within Google SecOps, specifically for malware triage and SOAR automation. As a design partner with Google, we are actively testing these agents in our environment and providing feedback. Although these solutions are not yet in production, since Google still needs to make them available to production customers, we are optimistic about their potential to improve our speed of execution and response. The ultimate goal is to better manage the overwhelming number of alerts a SOC receives and to focus on identifying meaningful true positives. While we are still in the trial and testing stages, we anticipate that these AI capabilities will eventually help us navigate the alert landscape more effectively.
I find that the vast majority of AI-enabled cybersecurity solutions do not do what they claim or are significantly overhyped. The best AI solution I have found is an AI security product designed to protect our AI models, rather than a cybersecurity tool with AI embedded in it. We have had success with our security automation platform, which has built-in AI that is useful for creating playbooks. You could do the same thing with ChatGPT; it is just easier when it is integrated into the user interface, and because it is isolated within our environment, our data is not shared. There are a few areas where we have found value, but for the most part I am not seeing much that adds significant value. Even at the CrowdStrike conference a couple of months ago, there was a lot of new functionality showcased that seems promising but is not fully developed. So far, I have not found anything noteworthy that is actually useful or something we want to use. We are using AI in our business, but we are building our own large language models, which is different from buying off-the-shelf solutions.
Many vendors are selling products with an extra layer of AI wrapped around them, but it is not all there yet. We have vendors that have some generative AI built into their products, but it is still in the “coming soon” phase rather than delivering true measurable value.
We are extremely cautious about adopting AI. We have been using AI for decades, primarily in the form of basic machine learning, especially in the cybersecurity realm. Many of our tools have incorporated machine learning and basic AI for years, so we have not implemented anything that made grand promises but failed to deliver. This is mainly because our first question is always: how is this tool any different? What is this AI? We have declined many proposals from our lines of business because they appear shiny and attractive but do not offer anything different from what we already have. These solutions are not going to help our teams do their jobs better, and there are significant security questions around them. As a financial institution, we cannot allow our data to be used for training on open platforms, so we must be very cautious. What we have put in place has been very successful, largely due to our stringent due diligence process.
This is a pertinent question, as the hype surrounding AI in cybersecurity is significant. At our organization, we use Microsoft products and benchmark them against competitors, continually assessing metrics such as alert volume reduction, false positive (FP) rate, mean time to detect (MTTD), and mean time to respond (MTTR). Many market products claim substantial improvements, such as 60 to 70% alert reduction and faster MTTR, but our testing has shown gains closer to 10 to 15%, which is considerably lower than advertised. While these tools did reduce some duplicate alerts and provided context-aware correlation across telemetry, they also missed important elements. Key lessons from our experiments include the necessity for products to be highly context-aware, incorporating metadata such as identity, asset criticality, and business processes. Without sufficient context, the models tend to underperform. Benchmarking vendor claims and establishing baselines are crucial for maximizing tool effectiveness. Additionally, factors like data pipelines, suppression logic, and the involvement of humans in the loop significantly affect accuracy and efficiency. Automated triggers can drift into blind spots, reducing effectiveness. Overall, our comparative analysis emphasized the importance of context, data quality, and human oversight to ensure optimal ROI when investing in AI tools.
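As a rough illustration of the baselining step described above, the sketch below computes FP rate, MTTD, and MTTR from a set of alert records and compares a before window with an after window, which is how an advertised reduction can be checked against what is actually measured. The Alert structure and field names are assumptions for illustration only, not any product's schema.

```python
# Illustrative sketch: establish baseline SOC metrics and measure relative improvement
# after enabling an AI-assisted tool. Field names are hypothetical.
from dataclasses import dataclass
from datetime import datetime
from statistics import mean


@dataclass
class Alert:
    occurred_at: datetime   # when the underlying activity happened
    detected_at: datetime   # when the alert fired
    resolved_at: datetime   # when an analyst closed it
    true_positive: bool


def baseline_metrics(alerts: list[Alert]) -> dict:
    """Compute the figures vendor claims should be benchmarked against."""
    fp_rate = sum(not a.true_positive for a in alerts) / len(alerts)
    mttd = mean((a.detected_at - a.occurred_at).total_seconds() for a in alerts) / 60
    mttr = mean((a.resolved_at - a.detected_at).total_seconds() for a in alerts) / 60
    return {"alert_volume": len(alerts), "fp_rate": fp_rate,
            "mttd_minutes": mttd, "mttr_minutes": mttr}


def relative_improvement(before: dict, after: dict) -> dict:
    """Percentage reduction per metric between the two measurement windows."""
    return {metric: round(100 * (before[metric] - after[metric]) / before[metric], 1)
            for metric in before if before[metric]}
```

Running the same calculation over a pre-deployment window and a post-deployment window is what makes it possible to compare a claimed 60 to 70% reduction with the 10 to 15% gain actually observed.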