
Comments (14)

rfoltyns avatar rfoltyns commented on May 28, 2024

Yes, it's possible.

Resource-wise - it depends. Usually it's not efficient: every Elasticsearch appender instance spawns several threads (batch emitter, metric printers, IO threads), and if you use pooled buffers, it also allocates at least as many bytes as you configure.

It depends what you need. Could you describe what exactly you want to achieve?

E.g. if you'd like to have mixins-per-index, I'd recommend programmatic config with:

  • one AsyncBatchDelivery, plus one ItemSourceFactory and ObjectWriter per set of mixins, used like this:
asyncBatchDelivery.add(indexName, itemSourceFactory.create(logEvent, objectWriter));

or

  • one AsyncBatchDelivery and one JacksonJsonLayout per set of mixins (with a shared ItemSourceFactory!), used like this:
asyncBatchDelivery.add(indexName, jacksonJsonLayout.serialize(logEvent));

from log4j2-elasticsearch.

romanmar1 avatar romanmar1 commented on May 28, 2024

I see.

I'll change the focus of my question a bit (or a lot):
Suppose I need to write logs for different components to different indices with different payloads,
and I want to be as resource efficient as possible (for example: instantiate only one instance of JestClient, or one instance of the Apache async HTTP client, with everything shared).
Mixins are no longer relevant to the question (I think I'll use PatternLayout instead).

How would I do that?


rfoltyns avatar rfoltyns commented on May 28, 2024

If you want to use the same appender instance for multiple indices, for now, you'll have to write your own IndexNameFormatter tailored to your needs and derive the index name from message content or context data.

I think a context-data-based formatter with StrLookup should do the trick. I'll add it in a future release.
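A minimal sketch of such a formatter in plain Java, deriving the index name from a log event's context data with a fallback. The class name, the `"index"` key, and the `Map` stand-in for `LogEvent.getContextData()` are all assumptions for illustration, not the library's actual API:

```java
import java.util.Map;

// Hypothetical sketch: derives the target index name from context data,
// falling back to a configured default when no override is present.
class ContextIndexNameFormatter {

    private static final String CONTEXT_KEY = "index"; // assumed key name

    private final String defaultIndexName;

    ContextIndexNameFormatter(String defaultIndexName) {
        this.defaultIndexName = defaultIndexName;
    }

    // contextData stands in for the real LogEvent's context data
    String format(Map<String, String> contextData) {
        String name = contextData.get(CONTEXT_KEY);
        return (name == null || name.isEmpty()) ? defaultIndexName : name;
    }
}
```

A real implementation would plug into the library's IndexNameFormatter extension point and read the LogEvent directly.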

If you need to be resource efficient, use the hc module. It allocates several times less memory than jest, has lower CPU overhead, and further improvements are on the way.


romanmar1 avatar romanmar1 commented on May 28, 2024

That's not exactly what I want.
Using the same Elasticsearch appender while deriving the index name from context would be analogous to using the same RollingFileAppender and deriving the file name from context, which seems to me like the wrong kind of solution.

I want to use different appenders, possibly with different filters and everything else Log4j2 has to offer, and be resource efficient. (Specifically, to be able to share the same client or delivery mechanism.)

ps: Can I use hc with PatternLayout? I think I tried that and it didn't work (something about using the ItemSource API only).


rfoltyns avatar rfoltyns commented on May 28, 2024

You can achieve that only if you configure multiple appenders programmatically. Filters are applied before LogEvent gets to the appender, so you should be fine there.

Layouts that produce raw Strings - including PatternLayout - are not compatible with hc; only ItemSourceLayouts are, and I'm not planning to change that. JacksonJsonLayout is the only implementation at the moment.


romanmar1 avatar romanmar1 commented on May 28, 2024

Will you consider adding support for that via configuration files?


rfoltyns avatar rfoltyns commented on May 28, 2024

Short answer - no, never.

With Log4j2, each appender has its own lifecycle - all underlying components must be reliably stopped when requested and must release all resources, especially running threads, connections, and thread locals(!) if used. They can't leak.

With config files, even if you manage to share resources between multiple appenders, the main challenge is detecting that the last appender has been stopped so the shared resource can be released. Log4j2 has control over that, so a static registry with a usage count is one of the ideas. Possible, but horrible - I'll never add support for that.
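The "static registry with a usage count" idea could be sketched as follows in plain Java. Everything here - the class, the `Closeable` interface, the acquire/release names - is illustrative and not part of log4j2-elasticsearch:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Illustrative sketch: shared resources are handed out by name; the last
// release closes the resource. This is the mechanism described as
// "possible, but horrible" above.
class SharedResourceRegistry {

    interface Closeable { void close(); }

    private static final class Entry {
        final Closeable resource;
        int refCount;
        Entry(Closeable resource) { this.resource = resource; }
    }

    private static final Map<String, Entry> REGISTRY = new HashMap<>();

    // First caller creates the resource via the factory; later callers
    // share it and bump the usage count.
    static synchronized Closeable acquire(String name, Supplier<Closeable> factory) {
        Entry e = REGISTRY.computeIfAbsent(name, k -> new Entry(factory.get()));
        e.refCount++;
        return e.resource;
    }

    // The last appender out closes the shared resource.
    static synchronized void release(String name) {
        Entry e = REGISTRY.get(name);
        if (e != null && --e.refCount == 0) {
            REGISTRY.remove(name);
            e.resource.close();
        }
    }
}
```

The hard part in a real Log4j2 deployment is not this counting logic but guaranteeing the releases actually happen on every shutdown path - which is exactly the objection raised above.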

However, programmatic config will allow you to do it - this library is highly extensible by design. You can extend ElasticsearchAppender.stop() and ElasticsearchAppender.createItemAppenderFactory() methods and control the lifecycle of AsyncBatchDelivery on your own. Have a look at the architecture. Maybe you'll find a better way.

How many appenders would you like to have?


romanmar1 avatar romanmar1 commented on May 28, 2024

I'll probably start with 3 and go up from there, up to 10, give or take.

I understand your concerns.
I might add that the solution to this problem, at least initially, doesn't have to be complete, imho.
The demands can be more relaxed.

In the scenario where a couple of Elasticsearch appenders share a common delivery mechanism, it seems unlikely that one would want to stop some of them and not the others. Assuming that, you can forgo the reference-count mechanism and close the first appender completely. For the other appenders this means they stop working; on closing them, they dispose of their resources only if not already disposed.

Making this a partial capability will not jeopardize the library, I think. On the contrary, you'd be adding what I believe is a common and important feature. Elasticsearch is a very common logging platform nowadays, and using Elasticsearch appenders should be as easy as using file appenders, without taking a heavy toll on resources.


rfoltyns avatar rfoltyns commented on May 28, 2024

It seems unlikely that one would want to stop some of them, and not the others

You're right. However, that still can't allow a Lifecycle.stop() call on the first appender to destabilize the others.

close the first appender completely, which will mean for the other appenders that they will stop working

That's exactly why I'd rather sacrifice a few megabytes of heap and close them properly, in order, allowing all in-flight logs to be delivered safely. I don't want to jeopardize your data(!) If a graceful shutdown can be executed, it should be.

The demands can be more relaxed

They can, as long as it doesn't lead to data loss or instability of the client application.

without taking a heavy toll on resources

I agree - that's why I started work on pooled buffers and hc, delaying other key features. The difference in overhead between hc and the default jest module is huge.


The partial solution is there already - programmatic config with user-provided handling of AsyncBatchDelivery lifecycle.


romanmar1 avatar romanmar1 commented on May 28, 2024

That's exactly why I'd rather sacrifice a few megabytes of heap, close them properly, in order, allowing all logs in flight to be delivered safely. I don't want to jeopardize your data(!) If graceful shutdown can be executed, it should be.

It's logging data, and the fact that it's being channeled directly to ES already puts it, theoretically, in some jeopardy. It's a calculated risk.

They can as long as it doesn't lead to data loss and instability of client application.

Data loss, perhaps; I don't see how it would lead to instability.

There are other options as well. For example, two types of Elasticsearch appenders: one is what you have available now; the other is a decorating Elasticsearch appender that gets the first type by reference. The decorating appender would use the inner appender (by actually decorating ItemAppender) and only needs to pass the index name through to it. That would make it quite easy, I think, to implement reference counting on the inner appender: whenever a decorating appender obtains the inner appender by reference, the counter goes up; on dispose, it goes down; at zero, the inner appender is disposed. Even simpler, you could enforce disposal order: decorating appenders first, inner appenders last.
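The decorating-appender idea could be sketched like this in plain Java. All names here (`InnerAppender`, `Decorator`, the method signatures) are hypothetical, not part of the library:

```java
// Illustrative sketch of the proposed decorating appender: each decorator
// holds a reference to one shared inner appender and only contributes the
// index name; lifecycle (ref counting or ordered disposal) would live on
// the inner appender.
class DecoratingAppenderSketch {

    public interface InnerAppender {
        void append(String indexName, String message);
        void stop();
    }

    public static class Decorator {
        private final String indexName;
        private final InnerAppender inner;

        public Decorator(String indexName, InnerAppender inner) {
            this.indexName = indexName;
            this.inner = inner;
        }

        public void append(String message) {
            // The only job of the decorator: transfer the index name downstream.
            inner.append(indexName, message);
        }
    }
}
```

Several decorators, each configured with its own index name and filters, would then share one inner appender (and therefore one client and delivery mechanism).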

The partial solution is there already - programmatic config with user-provided handling of AsyncBatchDelivery lifecycle.

Well, yes and no. I need to write logs not only to ES but to other destinations as well. Also, the project I'm currently working on already has a number of log4j2 XML configurations, and I don't want the logging configuration to live in several places (code and XML). I'm a big fan of programmatic configuration myself, but that's not an acceptable reason to move everything to code-based configuration.

I agree, that's why I started to work on pooled buffers and hc delaying other key features - the difference between hc and default jest module overhead is huge.

Can you provide some absolute metrics for this? Perhaps from tests that you conducted?


rfoltyns avatar rfoltyns commented on May 28, 2024

It's a calculated risk

I'm not willing to increase it.

i don't see how it will lead to instability

If resources are not closed properly, or are closed prematurely, some classes may not be unloaded in hot-deploy environments (this applies to most dangling resources).

Decorating appender get an inner appender by ref

Options are already available in Log4j2: AsyncAppender, RoutingAppender, RewriteAppender. All of them can ref a shared appender.

Here are measurements of GC times and memory allocation rate, taken from SmokeTest.programmaticConfigTest in jest and SmokeTest.programmaticBufferedConfigTest in hc, revision 946ac3e, at ~10,000 logs per second (each log 200 characters), with a 2 GB heap and 10,000,000 logs sent. The log is allocated once to measure library overhead.

Shenandoah (default, no tuning):

Metric                      jest            hc
No. of minor collections    132             12
Pause interval              ~8 sec          ~120 sec
Total Pause Time            477 ms          104 ms
Avg. Pause Time             1.72 ms         2.60 ms
Allocation Rate             185.81 MB/sec   11.45 MB/sec
Allocated Total             187.71 GB       11.2 GB
GC Throughput               99.954%         99.99%

CMS (default, no tuning):

Metric                      jest            hc
No. of minor collections    341             21
Pause interval              ~3 sec          ~50 sec
Total Pause Time            3 sec 910 ms    430 ms
Avg. Pause Time             11.4 ms         18.7 ms
Allocation Rate             172.97 MB/sec   11.17 MB/sec
Allocated Total             174.17 GB       11.06 GB
GC Throughput               99.621%         99.958%


romanmar1 avatar romanmar1 commented on May 28, 2024

Options are already available with Log4j2: AsyncAppender, RoutingAppender, RewriteAppender. All of them can ref shared appender.

I don't see how they solve the problem we discussed. Through them it's possible to implement context-based routing, but they don't solve the problem of sharing the delivery mechanism across different Elasticsearch indices.

Thanks for the performance metrics. They will help decide which client to use.


rfoltyns avatar rfoltyns commented on May 28, 2024

You're welcome! :) Always a pleasure to do these kinds of comparisons.

As mentioned earlier, you can use your own, custom IndexNameFormatter based on context data and either:

  • RewriteAppender with custom RewritePolicy parametrized with indexName
  • or a custom MessageFactory that creates Messages with an indexName context parameter; not perfect, because one may prefer to use SLF4J; however, since it's at the class level, it allows parametrization at any level of granularity (class, package, service, group, etc.) with a properties supplier of your choice;
  • or ThreadContext (formerly MDC), which is the simplest, but requires careful cleanup (thread locals); all parametrization options apply as well
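The ThreadContext option can be illustrated with a plain-Java stand-in. Log4j2's real ThreadContext offers the same put/remove contract; the class below and its `withIndex` helper are assumptions for illustration, and the try/finally block mirrors the "requires careful cleanup" caveat:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Illustrative stand-in for Log4j2's ThreadContext: a thread-local map
// carrying the target index name for the current thread.
class IndexContext {

    private static final ThreadLocal<Map<String, String>> CTX =
            ThreadLocal.withInitial(HashMap::new);

    static void put(String key, String value) { CTX.get().put(key, value); }
    static String get(String key) { return CTX.get().get(key); }
    static void remove(String key) { CTX.get().remove(key); }

    // Recommended usage pattern: set the index, run the logging code,
    // and ALWAYS clean up - otherwise the value leaks into the next task
    // that reuses this (possibly pooled) thread.
    static <T> T withIndex(String indexName, Supplier<T> body) {
        put("index", indexName);
        try {
            return body.get();
        } finally {
            remove("index");
        }
    }
}
```

With the real ThreadContext, an index-name formatter on the appender side would read the same key from the event's context data.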

IMHO, they are reasonably safe solutions to the problem you'd like to solve. All of these interfaces are battle proven, well tested and allow you to use everything that Log4j2 has to offer.

I can add an optional IndexNameResolver with StrLookup to the current formatters in 1.5. That would allow you to achieve a lot with minimal changes. Do you think this could be of any use to you?


romanmar1 avatar romanmar1 commented on May 28, 2024

RewriteAppender with custom RewritePolicy parametrized with indexName

This one will do, together with the IndexNameResolver and StrLookup.
Thanks.

