Example blog

Log management tools add finesse to Elasticsearch

The data and query management capabilities of the log management software vendors were crucial for two companies who found raw Elasticsearch difficult to use.

Log management was once practiced primarily by high-tech IT departments, but the rise of microservices applications and complex cloud-native architectures has made collecting detailed log data a common requirement for mainstream businesses.

The ELK stack, which includes Elasticsearch for querying logs, Logstash for collecting and managing log data, and the Kibana data visualization tool, is widely used for collecting, indexing and querying log data. While versatile in their raw form, Elasticsearch and the ELK stack can be difficult to manage for IT professionals who lack deep expertise in Elasticsearch’s native query language and log data structures.

This is where log management software vendors LogDNA and Logz.io have come in over the past 18 months for a financial services company and a messaging startup. The vendors’ products, which use Elasticsearch behind the scenes, include features such as easily accessible query interfaces and sophisticated log data analysis that improve DevOps collaboration and IT incident response for these customers.

“Some of the other competitors in the field … give the end user a little more exposure to the native Elasticsearch [query], so you need to know a bit more about how Elasticsearch works to extract data from it,” said Mark Pimentel, cloud engineering manager at PlatformZero, a financial services software division of Capco, a London-based digital consulting firm. “[LogDNA] lets you search for various information through keys and tags, items in an index, and querying in LogDNA was pretty rudimentary.”
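To illustrate the gap Pimentel describes, the following minimal sketch turns a simple key:value search string into the kind of JSON request body that Elasticsearch’s native Query DSL expects. The `key:value` syntax, the `to_query_dsl` helper and the field names are hypothetical and do not represent LogDNA’s actual query syntax or implementation; only the `bool`/`filter`/`term` body shape comes from the Elasticsearch Query DSL.

```python
import json


def to_query_dsl(search: str) -> dict:
    """Convert space-separated key:value pairs into an Elasticsearch bool query.

    Hypothetical helper for illustration only; not LogDNA's implementation.
    """
    filters = []
    for token in search.split():
        key, sep, value = token.partition(":")
        if not sep or not key or not value:
            raise ValueError(f"expected key:value, got {token!r}")
        # Each pair becomes an exact-match term filter on that field.
        filters.append({"term": {key: value}})
    return {"query": {"bool": {"filter": filters}}}


if __name__ == "__main__":
    # Field names "level" and "app" are assumptions made for this example.
    body = to_query_dsl("level:error app:checkout")
    print(json.dumps(body, indent=2))
```

Even for a two-field search, the native request body is a nested structure, which is the kind of detail a simplified query interface hides from developers.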

LogDNA simplifies queries for DevOps collaboration

PlatformZero was initially looking for a log management software product to create a separate, controlled-access data pool for developers and product managers who would otherwise not have direct access to the system logs that had been collected internally through Elasticsearch. It selected LogDNA to create this data repository, in part because its simplified query interface would make information accessible to developers both conceptually and logically.

LogDNA Enterprise software combines a proprietary message broker service called Buzzsaw with an Elasticsearch back end. This system manages log parsing, a process that sorts log files into coherent chunks of information that are easier to manipulate, store and search. It also presents its own query interface to end users through a web UI that PlatformZero staff found easier to use than the native Elasticsearch query language, Pimentel said.
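The step of turning raw log lines into structured, searchable fields can be sketched as follows. This is a minimal illustration that assumes a hypothetical line layout of timestamp, level, service name and message; it is not how Buzzsaw or LogDNA processes logs, and the `LINE_PATTERN` and `parse_line` names are invented for the example.

```python
import re
from typing import Optional

# Hypothetical line layout: "<timestamp> <LEVEL> <service>: <message>"
LINE_PATTERN = re.compile(
    r"^(?P<timestamp>\S+)\s+(?P<level>[A-Z]+)\s+(?P<service>[\w.-]+):\s+(?P<message>.*)$"
)


def parse_line(line: str) -> Optional[dict]:
    """Split one raw log line into named fields, or return None if it does not match."""
    match = LINE_PATTERN.match(line.strip())
    return match.groupdict() if match else None


if __name__ == "__main__":
    raw = "2021-03-01T12:00:00Z ERROR checkout: payment gateway timeout"
    print(parse_line(raw))
```

Once each line is a dictionary of named fields, it can be indexed and searched by field rather than by scanning raw text.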

LogDNA is simple enough to be used by application developers new to infrastructure management and the ELK stack, as well as release managers who work with developers to assess the success of software deployments. But it’s also sophisticated enough to be used by the company’s site reliability engineers (SREs) in tandem with the SignalFx APM tool for incident response.

LogDNA introduced a feature called Usage Quotas in March that limits the output of data from various services when users query them, in order to reduce the cost spikes associated with extensive data searches. PlatformZero rolled out this feature to production shortly after its introduction.

“It doesn’t reduce costs so much as it makes them more predictable,” Pimentel said.

The company used SignalFx prior to its acquisition by Splunk, and while ease of use is paramount with LogDNA’s tool, Pimentel said the company would like to see the vendor add to its roadmap some of the advanced log management features that other competitors offer. These include AIOps and other sophisticated log analysis functions such as post-ingestion indexing.

In addition to usage quotas, LogDNA has data management features such as exclusion rules, which allow teams to choose which logs they store, as well as field extraction and aggregation, which allow users to view and export fields from previously indexed log lines. LogDNA officials did not say whether AIOps and other data analysis features were on the company’s roadmap.
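As a rough illustration of the exclusion-rule idea, the sketch below drops any parsed log record that matches at least one rule before the records are stored. The rule format (plain Python predicates), the `apply_exclusion_rules` helper and the sample fields are all assumptions for this example and do not reflect LogDNA’s actual rule syntax.

```python
from typing import Callable, Iterable, List

# A rule is any function that returns True for records that should NOT be stored.
Rule = Callable[[dict], bool]


def apply_exclusion_rules(records: Iterable[dict], rules: List[Rule]) -> List[dict]:
    """Keep only the records that match none of the exclusion rules."""
    return [r for r in records if not any(rule(r) for rule in rules)]


if __name__ == "__main__":
    # Hypothetical rules: skip debug noise and health-check traffic.
    rules = [
        lambda r: r.get("level") == "DEBUG",
        lambda r: "healthcheck" in r.get("message", ""),
    ]
    records = [
        {"level": "DEBUG", "message": "cache warmed"},
        {"level": "INFO", "message": "GET /healthcheck 200"},
        {"level": "ERROR", "message": "payment gateway timeout"},
    ]
    print(apply_exclusion_rules(records, rules))
```

Filtering before storage is what lets teams control retained volume, since excluded lines never reach the index.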

Logz.io facilitates ELK troubleshooting

As a startup with a team of 15 engineers responsible for all aspects of IT, New York-based Holler sought outside help with log management after an incident in 2019.

“We grew fast enough to attract new partners, one of which was Venmo, and it was really difficult to have visibility into the back end when things went wrong,” said Daniel Seravalli, chief engineer at the company, which uses GIFs and stickers in popular web and mobile applications. “We had a monitoring stack, but it never worked properly.”

In July 2019, the business started experiencing long outages that sometimes took weeks to resolve.

“Investigating them meant collecting raw data from servers and aggregating it manually – we didn’t have dashboards that we could use as a starting point for our investigation,” Seravalli said.

Then the company released a new version of its software development kit (SDK) to a major partner, and it started generating a lot more log data than Seravalli’s team expected. This put a strain on the company’s Kafka data pipeline and storage infrastructure.

“It took us two weeks to really understand what was going on there – we just didn’t have the data to spell it out,” Seravalli said. “Nine months later we had a similar incident, but we had Logz.io and it took us a day to figure it out.”

Logz.io is a software as a service (SaaS) provider that hosts open source observability and data visualization tools, including the ELK stack. After the SDK incident in 2019, Holler decided to switch from an internally managed ELK stack to the Logz.io version. It also considered Splunk but chose Logz.io in part because its pricing was attractive.

Since then, Holler has also started expanding its observability tools to include distributed tracing and time series metrics, which Logz.io also offers through a Jaeger-based service released in 2019 and Prometheus as a service, which became available in March. Holler also used Logz.io’s Grafana-based interface for metrics monitoring. Logz.io adds value to these open source tools by correlating data between them and providing direct links between their dashboards.

Logz.io’s dashboards are also preconfigured to surface key information, unlike previous attempts by Holler’s internal developers to display data through Kibana, which Seravalli described as working “blindly.”

Finally, Logz.io technical support engineers advised Holler’s IT pros on how to configure monitoring of Kafka data pipelines, including creating complex log parsing rules.

“It meant a lot to us, that Logz.io was ready to help us like that, after the sale,” said Seravalli.

Like PlatformZero’s Pimentel, Seravalli would like to use more AIOps and data analysis features within Logz.io as his business grows, and he said he hopes to see Logz.io add synthetic tracing to its Jaeger-based services.

Synthetic tracing will likely ship next year, according to officials at Logz.io.

“We are working hard with the community to strengthen Jaeger for more and more APM use cases,” Logz.io CTO Jonah Kowall said in an email. “This contribution to Jaeger and OpenTelemetry is a work in progress… an important part of APM is synthetic monitoring, and it’s probably the next step for Logz.io.”

As Holler continues to grow, it may also add in-house operational expertise and run its own ELK stack again, which is why Logz.io’s open source foundation is important, Seravalli said. But in the meantime, working with a service provider has also saved Seravalli’s team from having to deal with the Elasticsearch licensing controversies that arose in the first months of this year.

“That’s why I work with a service provider, so I don’t have to worry about that stuff,” he said. “But I also like the managed open source model, because if we’re going to re-introduce it internally in two years, we won’t have spent the last five learning proprietary technology.”

Beth Pariseau, Senior Editor at TechTarget, is an award-winning 15-year veteran of computer journalism. She can be reached at [email protected] or on Twitter @PariseauTT.
