Giter VIP home page Giter VIP logo

Comments (18)

rfoltyns avatar rfoltyns commented on May 28, 2024

Hi @vesselinn. I can't reproduce it. RollingIndexName (actually RollingIndexNameFormatter class) takes the time as-is from LogEvent.getTimeMillis(). I've just unit tested the weekly rollover. 2 events everyday: 1 daily, second one with random time every day just to try to 'confuse' the formatter, from January 2020, 3 years ahead. It rolled over exactly every 7 days, calculating index name correctly every day. Could you provide me with a test that can reproduce this behavior?

Are the logs in your indices have correct timeMillis values? Given that you can create an index pattern per index, you should be able to verify that documents are indexed for more than a week in a single index.

from log4j2-elasticsearch.

vesselinn avatar vesselinn commented on May 28, 2024

Hi @rfoltyns. Let me give you examples from our kibana:

  • First log with index 2020-26:

Date: 2020-06-22 14:59:07.986
timeMillis: 1592827147987

  • Last log with index 2020-26:

Date: 2020-07-04 23:59:13.271
timeMillis: 1593896353271

  • First log with index 2020-27:

Date: 2020-07-05 00:00:13.572
timeMillis: 1593896413572

Thanks

from log4j2-elasticsearch.

rfoltyns avatar rfoltyns commented on May 28, 2024

How consistently does that happen? Does it happen right after the deployment or after few weeks of JVM running? How many documents do you send to ES per day? Do you send logs immediately after they're produced or do you replay/re-process them? Do you have any post processing defined in ES? Pipelines? Do you have any index management defined in ES? Can you reproduce this in a test?

from log4j2-elasticsearch.

vesselinn avatar vesselinn commented on May 28, 2024

Hi, When we noticed the problem we left one server (JVM) to run without restart for 3 weeks (you can see the result).
image
On this picture you can see how many logs we send to ELK. Actually we use java util logging to produce logs, but they are handled and transformed to log4j2 logs and send to ELK via elasticsearch asynch appender (shown in my first post). We don't have mechanism to re-process them. I've also unit tested the behavior of weekly rollover and it was running normally. Could it be a problem with WebLogic server as all our apps and the Handler are deployed on it? When we run WebLogic all log indexes are in correct format and week of the year, but they are not changing on time - like Sunday 00:00 or Monday 00:00. We don't have any simultaneous scenario for changing the index. Also we don't have post processing and pipelines defined in ES, only mapping template for types of the data.

from log4j2-elasticsearch.

vesselinn avatar vesselinn commented on May 28, 2024

I've also found log4j2 ticket mentioned in org.apache.logging.log4j.core.appender.rolling.PatternProcessor.getNextTime() / log4j-core-2.11.1jar. Could it be something related to the issue?

from log4j2-elasticsearch.

rfoltyns avatar rfoltyns commented on May 28, 2024

Given a fairly constant daily rate, sth apparently went wrong on 22nd June. Could send me the output of this query?

curl http://somehost:9200/_cat/indices/log4j.2020-*?v=&s=index

Ticket you mentioned describes issues on year change, but maybe it's worth looking into it.

BTW, If you're transforming the logs from java logging, you can actually use AsyncBatchDelivery directly, this way you'll have full control of the target index name (and maybe discover an issue with it a bit easier). Also, it might flesh out some timestamp issues.

from log4j2-elasticsearch.

cuneytcalishkan avatar cuneytcalishkan commented on May 28, 2024

Hello @rfoltyns ,

We are also using this project and we've come across the same problem.

Here is a unit test to reproduce the issue and I guess I have found why the problem happens.
The nextRolloverTime is calculated at construction time of RollingIndexNameFormatter and if this is the only instance that is used for each message before logging, we end up with this error.
If we create a new instance each time, then the index is calculated correctly.
Please take a look at the unit test code below and let me know if I could help to solve this issue or if this test makes sense at all.

@Test
  public void testWeeklyRollingIndexPattern() {
    Long sundayBeforeMidnight = LocalDateTime.of(2020, 9, 20, 23, 59, 59)
                                             .atZone(ZoneId.of("GMT"))
                                             .toInstant().toEpochMilli();
    RollingIndexNameFormatter.Builder builder = spy(RollingIndexNameFormatter.newBuilder());
    when(builder.getInitTimeInMillis()).thenReturn(sundayBeforeMidnight);
    builder.withIndexName("index");
    builder.withPattern("yyyy-ww");
    builder.withSeparator("-");
    builder.withTimeZone("GMT");
    RollingIndexNameFormatter formatter = builder.build();

    LogEvent logEvent = mock(LogEvent.class);
    when(logEvent.getTimeMillis()).thenReturn(sundayBeforeMidnight);

    String formattedIndexName = formatter.format(logEvent);
    Assert.assertEquals("index-2020-38", formattedIndexName);

    Long mondayAfterMidnight = LocalDateTime.of(2020, 9, 21, 0, 0, 1)
                                            .atZone(ZoneId.of("GMT"))
                                            .toInstant().toEpochMilli();

    when(builder.getInitTimeInMillis()).thenReturn(mondayAfterMidnight);
    LogEvent newWeekEvent = mock(LogEvent.class);
    when(newWeekEvent.getTimeMillis()).thenReturn(mondayAfterMidnight);

    String nextWeekIndex = formatter.format(newWeekEvent);
    Assert.assertEquals("index-2020-38", nextWeekIndex); // should have failed and produced index-2020-39

    formatter = builder.build(); // creating a new formatter instance

    nextWeekIndex = formatter.format(newWeekEvent);
    Assert.assertEquals("index-2020-39", nextWeekIndex);
  }

from log4j2-elasticsearch.

rfoltyns avatar rfoltyns commented on May 28, 2024

Thank you @cuneytcalishkan!

Looks like a bug.

If you'd like to solve it, I'm more than happy to accept your help 👍 I haven't thought about any concrete solutions yet, but there are few things to have in mind:

  • we cannot create new instances as this particular algorithm it's specifically designed to reduce memory allocation as much as possible; current* and next* fields are there to ensure that in a typical scenario, calculations and allocations are made only "around" rollover time
  • rollover must be thread-safe, but it can't lock (that's why atm it falls back at RollingIndexNameFormatter:116)
  • formatter must return correct index name if logs from previous period arrived after rollover

from log4j2-elasticsearch.

cuneytcalishkan avatar cuneytcalishkan commented on May 28, 2024

Hello @rfoltyns,

So far I've managed to debug a little bit and understand what is going on behind the scenes of PatternProcessor that is created at initialization time, which actually calculates the next rollover time.

What I've found out is that, this depends on the system settings about the country/region where the first day of the week changes depending on the locale. e.g. United States, first day of the week is Sunday whereas for France it is Monday.

You can see in the below 2 unit tests which demonstrate this.

In order for returnsWeeklyRolloverIndexNameWithFrance to pass, you need to set the country setting of your OS to some location where the first day of the week is a Monday. With this setting, returnsWeeklyRolloverIndexNameWithUS will fail.
If you change the country setting to United States, then returnsWeeklyRolloverIndexNameWithUS will pass and the other one will fail.

    @Test
    public void returnsWeeklyRolloverIndexNameWithFrance() {
      final String timezone = "Europe/Paris";
      long loggerInitMillis = LocalDateTime.of(2020, 10, 8, 21, 48, 35)
                                           .atZone(ZoneId.of(timezone))
                                           .toInstant().toEpochMilli();
      RollingIndexNameFormatter.Builder builder = spy(RollingIndexNameFormatter.newBuilder());
      when(builder.getInitTimeInMillis()).thenReturn(loggerInitMillis);
      builder.withIndexName("index");
      builder.withPattern("yyyy-ww");
      builder.withSeparator("-");
      builder.withTimeZone(timezone);
      RollingIndexNameFormatter formatter = builder.build();

      long sundayBeforeMidnight = LocalDateTime.of(2020, 10, 11, 23, 59, 59)
                                               .atZone(ZoneId.of(timezone))
                                               .toInstant().toEpochMilli();
      LogEvent thisWeekEvent = mock(LogEvent.class);
      when(thisWeekEvent.getTimeMillis()).thenReturn(sundayBeforeMidnight);
      String formattedIndexName = formatter.format(thisWeekEvent);
      Assert.assertEquals("index-2020-41", formattedIndexName);

      long nextWeekMondayAfterMidnight = LocalDateTime.of(2020, 10, 12, 0, 0, 1)
                                                      .atZone(ZoneId.of(timezone))
                                                      .toInstant().toEpochMilli();
      LogEvent nextWeekEvent = mock(LogEvent.class);
      when(nextWeekEvent.getTimeMillis()).thenReturn(nextWeekMondayAfterMidnight);
      String nextWeekIndex = formatter.format(nextWeekEvent);
      Assert.assertEquals("index-2020-42", nextWeekIndex);
    }

    @Test
    public void returnsWeeklyRolloverIndexNameWithUS() {
      final String timezone = "Europe/Paris";
      long loggerInitMillis = LocalDateTime.of(2020, 10, 8, 21, 48, 35)
                                           .atZone(ZoneId.of(timezone))
                                           .toInstant().toEpochMilli();
      RollingIndexNameFormatter.Builder builder = spy(RollingIndexNameFormatter.newBuilder());
      when(builder.getInitTimeInMillis()).thenReturn(loggerInitMillis);
      builder.withIndexName("index");
      builder.withPattern("yyyy-ww");
      builder.withSeparator("-");
      builder.withTimeZone(timezone);
      RollingIndexNameFormatter formatter = builder.build();

      long sundayBeforeMidnight = LocalDateTime.of(2020, 10, 10, 23, 59, 59)
                                               .atZone(ZoneId.of(timezone))
                                               .toInstant().toEpochMilli();
      LogEvent thisWeekEvent = mock(LogEvent.class);
      when(thisWeekEvent.getTimeMillis()).thenReturn(sundayBeforeMidnight);
      String formattedIndexName = formatter.format(thisWeekEvent);
      Assert.assertEquals("index-2020-41", formattedIndexName);

      long nextWeekMondayAfterMidnight = LocalDateTime.of(2020, 10, 11, 0, 0, 1)
                                                      .atZone(ZoneId.of(timezone))
                                                      .toInstant().toEpochMilli();
      LogEvent nextWeekEvent = mock(LogEvent.class);
      when(nextWeekEvent.getTimeMillis()).thenReturn(nextWeekMondayAfterMidnight);
      String nextWeekIndex = formatter.format(nextWeekEvent);
      Assert.assertEquals("index-2020-42", nextWeekIndex);
    }

I am currently checking with our operation team about this country/region setting of the machines.
However, even though it is already week 42 for both US and Europe, we still have logs written with index week 41 as of now.

from log4j2-elasticsearch.

rfoltyns avatar rfoltyns commented on May 28, 2024

Thank you @cuneytcalishkan. Great findings! 👍 I had a look at it as well and also found those Locale-based week ends. I've made a few tweaks during formatter initialization that gets your test to pass. Have a look here. I'll do some more testing today.

Also @vesselinn, YYYY-ww date pattern should be used in this case, not yyyy-ww. Otherwise you'll get incorrect behaviour at the end of the year, e.g.: 2020-01 instead of 2021-01 ~end of December 2020 (Java dates wizardry..). I'll add a note in the documentation.

It seems like the issue is caused by premature rollover. Initialising the formatter with nextRolloverTime = patternProcessor.getNextTime(...,0,..) was trying to force the rollover on first log. It was not catering for the case when first logs are arriving after factual rollover time causing it to rollover to far into the future.

Would you like to test the behaviour of 1.4.5-SNAPSHOT? I can push it out today. It should be available for download tomorrow.

from log4j2-elasticsearch.

cuneytcalishkan avatar cuneytcalishkan commented on May 28, 2024

@rfoltyns that would be great to test it with these changes if possible. Thank you very much.

from log4j2-elasticsearch.

rfoltyns avatar rfoltyns commented on May 28, 2024

Done. 1.4.5-SNAPSHOT - 1.4 branch with f5d18c3 - pushed to http://oss.sonatype.org/content/repositories/snapshots.

Add the repo to your pom.xml (or other build system equivalent) to get it:

<repositories>
    <repository>
        <id>oss.sonatype.org-snapshot</id>
        <url>http://oss.sonatype.org/content/repositories/snapshots</url>
        <releases>
            <enabled>false</enabled>
        </releases>
        <snapshots>
            <enabled>true</enabled>
        </snapshots>
    </repository>
</repositories>

from log4j2-elasticsearch.

cuneytcalishkan avatar cuneytcalishkan commented on May 28, 2024

Hello @rfoltyns ,
I've checked the logs on our system today and unfortunately the solution didn't work. We still see last week's number in today's messages as 2020-42. The deployment was done on Friday and we would expect to see index with week number 43 on the logs of today's messages.
We need to investigate a little bit more I guess. I will try to check again with your changes and different test scenarios.

from log4j2-elasticsearch.

rfoltyns avatar rfoltyns commented on May 28, 2024

Ok. If you're getting just a few events per day, could you create a test that reproduces it with exact timestamps?

from log4j2-elasticsearch.

rfoltyns avatar rfoltyns commented on May 28, 2024

@cuneytcalishkan I've uploaded another snapshot with isses/50 to http://oss.sonatype.org/content/repositories/snapshots.

This one looks a bit more promising. Rolling formatter test gets green once -Duser.timezone=GMT JVM arg is set.

user.timezone must match the timezone configured by the builder as PatternProcessor takes it from the env variables and FastDateFormat from the builder params - that's not great..

from log4j2-elasticsearch.

rfoltyns avatar rfoltyns commented on May 28, 2024

@cuneytcalishkan Did you have a chance to test the latest snapshot?

from log4j2-elasticsearch.

cuneytcalishkan avatar cuneytcalishkan commented on May 28, 2024

Hello @rfoltyns, we actually deployed new version of the code with the latest snapshot and it seems to be even worse.
Before, with a restart or a new deployment we would have the correct index. However, the latest snapshot takes the last week's number as the index. We had 2 deployments this week on Monday and Wednesday and the index is still pointing to week 44.
I haven't had time to provide some unit tests to reproduce the issue, yet. I will try to do that as soon as possible.

from log4j2-elasticsearch.

rfoltyns avatar rfoltyns commented on May 28, 2024

@cuneytcalishkan I got similar results in tests if -Duser.timezone=GMT was not set.

from log4j2-elasticsearch.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.