evtx's People

Contributors

alexkornitzer, andrewrathbun, codekoala, dependabot-preview[bot], dependabot[bot], dgmcdona, forensicmatt, kazuminn, ohadravid, omerbenamram, robo210

evtx's Issues

Output Tweaking

Just a small feature request. Could you exclude the null chars in output? It breaks a lot of processing of the output.

Also, there seems to be a formatting issue with integer rendering when the hex value is a single character: a space appears between 0x and the digit. I know these are small things, but fixing them helps a lot when serializing the output for ingestion or post-processing.
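
To make the two requests concrete, here is a minimal sketch of the desired output behavior. It is not the library's actual rendering code; the helper names `strip_nuls` and `render_hex` are hypothetical and exist only for illustration.

```rust
/// Hypothetical helper: removes every NUL character from a decoded string
/// so downstream tools do not choke on embedded `\0` bytes.
fn strip_nuls(value: &str) -> String {
    value.chars().filter(|&c| c != '\0').collect()
}

/// Hypothetical helper: renders an integer as `0x`-prefixed lowercase hex
/// with no padding between the prefix and the digits.
fn render_hex(value: u32) -> String {
    format!("{:#x}", value)
}
```

With these helpers a single-digit value renders without a gap between the prefix and the digit, and strings come out free of NUL padding.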

thread 'main' panicked at 'invalid or out-of-range date'

Hello !

We stumbled upon the error thread 'main' panicked at 'invalid or out-of-range date' while using the evtx library.
We are wondering whether this is the expected behavior and, if not, whether there is a workaround.
It seems that when the evtx library processes a "faulty" event, it fails with the aforementioned panic.

Used command:
./evtx_dump-v0.7.2-x86_64-unknown-linux-gnu <filename>.evtx -f <filename>.json --no-confirm-overwrite -ojson --no-indent

Error:

thread '<unnamed>' panicked at 'invalid or out-of-range date', /home/runner/.cargo/registry/src/github.com-1ecc6299db9ec823/chrono-0.4.19/src/naive/date.rs:173:51
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
thread '<unnamed>' panicked at 'invalid or out-of-range date', /home/runner/.cargo/registry/src/github.com-1ecc6299db9ec823/chrono-0.4.19/src/naive/date.rs:173:51
thread '<unnamed>' panicked at 'invalid or out-of-range date', /home/runner/.cargo/registry/src/github.com-1ecc6299db9ec823/chrono-0.4.19/src/naive/date.rs:173:51

We looked inside our evtx file with Windows Event Viewer. We found that the evtx command failed on events containing the following data:

<EventData>
    <Data Name="IdentificationGUID">{00280040-0022-0049-6400-65006e007400}</Data>
    <Data Name="ProtectorGUID">{00660069-0069-0063-6100-740069006f00}</Data>
    <Data Name="ProtectorType">0x47006e</Data>
    <Data Name="UnlockTime">1601-01-01T00:00:00.0000000Z</Data>
</EventData>

specifically on the "UnlockTime" field (see the attached image).

Things look fine when viewing the associated schema, though:

Template    : <template xmlns="http://schemas.microsoft.com/win/2004/08/events">
                <data name="IdentificationGUID" inType="win:GUID" outType="xs:GUID"/>
                <data name="ProtectorGUID" inType="win:GUID" outType="xs:GUID"/>
                <data name="ProtectorType" inType="win:HexInt32" outType="win:HexInt32"/>
                <data name="UnlockTime" inType="win:SYSTEMTIME" outType="xs:dateTime"/>
              </template>

We found topics similar to this case:

We therefore suppose that the raw evtx file contains an "UnlockTime" value of 0.
Windows Event Viewer supports this value and displays it as "1601-01-01T00:00:00.0000000Z", while the evtx library does not.
Looking at the code, we found that the library uses the chrono function from_ymd, which panics on invalid input.
As a result, if any event has an invalid "UnlockTime" value, the whole evtx file cannot be processed.
If this is the expected behavior, would it be possible to add an option that lets the user process the whole file while skipping faulty events, as a workaround?
If not, could switching from from_ymd to from_ymd_opt fix it? Affected events would then have an empty "UnlockTime" value.
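
The difference between a panicking and a checked date constructor can be sketched with the standard library alone. This is only an illustration of the pattern: `checked_ymd` and `render_date` are hypothetical stand-ins, not chrono's or evtx's code, and the sketch assumes (as supposed above) that the zeroed raw field decodes to month 0 and day 0.

```rust
/// Returns true for Gregorian leap years.
fn is_leap_year(year: i32) -> bool {
    (year % 4 == 0 && year % 100 != 0) || year % 400 == 0
}

/// Hypothetical checked constructor: returns `None` for an invalid
/// year/month/day combination instead of panicking.
fn checked_ymd(year: i32, month: u32, day: u32) -> Option<(i32, u32, u32)> {
    let days_in_month = match month {
        1 | 3 | 5 | 7 | 8 | 10 | 12 => 31,
        4 | 6 | 9 | 11 => 30,
        2 if is_leap_year(year) => 29,
        2 => 28,
        _ => return None, // month 0 or month > 12
    };
    if day == 0 || day > days_in_month {
        return None;
    }
    Some((year, month, day))
}

/// Renders a date, falling back to an empty string for invalid input so
/// that one bad field does not abort processing of the whole file.
fn render_date(year: i32, month: u32, day: u32) -> String {
    match checked_ymd(year, month, day) {
        Some((y, m, d)) => format!("{:04}-{:02}-{:02}", y, m, d),
        None => String::new(),
    }
}
```

The caller decides what an invalid date becomes (an empty value here), which is the behavior proposed in the last question above.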

In any case, thank you for your work !

Regards.

windows_event_viewer

Warnings will become errors

Came up during a quick check:

[...]
   = note: this warning originates in the macro `try_read` (in Nightly builds, run with -Z macro-backtrace for more info)

warning: trailing semicolon in macro used in expression position
  --> src/macros.rs:49:69
   |
49 |             .map_err(|e| capture_context!($cursor, e, "u16", $name));
   |                                                                     ^
   |
  ::: src/utils/time.rs:15:24
   |
15 |     let milliseconds = try_read!(r, u16)?;
   |                        ----------------- in this macro invocation
   |
   = warning: this was previously accepted by the compiler but is being phased out; it will become a hard error in a future release!
   = note: for more information, see issue #79813 <https://github.com/rust-lang/rust/issues/79813>
   = note: this warning originates in the macro `try_read` (in Nightly builds, run with -Z macro-backtrace for more info)

warning: `evtx` (lib) generated 88 warnings
    Finished release [optimized] target(s) in 49.87s
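
For reference, the warning goes away when the macro body expands to a single expression with no trailing semicolon. The sketch below shows that shape with hypothetical names (`read_u16_le`, `try_read_u16`); it is not the crate's real `try_read` macro.

```rust
/// Hypothetical reader standing in for the crate's real byte-reading code.
fn read_u16_le(bytes: &[u8]) -> Result<u16, String> {
    match bytes {
        [lo, hi, ..] => Ok(u16::from_le_bytes([*lo, *hi])),
        _ => Err("not enough bytes".to_string()),
    }
}

macro_rules! try_read_u16 {
    ($bytes:expr, $name:expr) => {
        // No trailing semicolon here: the expansion is one expression, so
        // `try_read_u16!(..)?` is valid in expression position.
        read_u16_le($bytes).map_err(|e| format!("{}: {}", $name, e))
    };
}

/// Uses the macro in expression position, as in the warning's example.
fn read_milliseconds(bytes: &[u8]) -> Result<u16, String> {
    let milliseconds = try_read_u16!(bytes, "milliseconds")?;
    Ok(milliseconds)
}
```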

README tweak

The README statement "For single core performance, it is both the fastest and the only parser than supports both xml and JSON outputs." is not accurate, as my project also supports XML and JSON output.

0.6.0 introduced error

I get the following error in version 0.6.0 but not in 0.5.1.

thread 'main' panicked at 'It can only be an object or null, and null was covered', src\libcore\option.rs:1188:5
stack backtrace:
   0: backtrace::backtrace::trace_unsynchronized
             at C:\Users\...\.cargo\registry\src\github.com-1ecc6299db9ec823\backtrace-0.3.40\src\backtrace\mod.rs:66
   1: std::sys_common::backtrace::_print_fmt
             at /rustc/7afe6d9d1f48b998cc88fe6f01ba0082788ba4b9\/src\libstd\sys_common\backtrace.rs:84
   2: std::sys_common::backtrace::_print::{{impl}}::fmt
             at /rustc/7afe6d9d1f48b998cc88fe6f01ba0082788ba4b9\/src\libstd\sys_common\backtrace.rs:61
   3: core::fmt::write
             at /rustc/7afe6d9d1f48b998cc88fe6f01ba0082788ba4b9\/src\libcore\fmt\mod.rs:1024
   4: std::io::Write::write_fmt<std::sys::windows::stdio::Stderr>
             at /rustc/7afe6d9d1f48b998cc88fe6f01ba0082788ba4b9\/src\libstd\io\mod.rs:1428
   5: std::sys_common::backtrace::_print
             at /rustc/7afe6d9d1f48b998cc88fe6f01ba0082788ba4b9\/src\libstd\sys_common\backtrace.rs:65
   6: std::sys_common::backtrace::print
             at /rustc/7afe6d9d1f48b998cc88fe6f01ba0082788ba4b9\/src\libstd\sys_common\backtrace.rs:50
   7: std::panicking::default_hook::{{closure}}
             at /rustc/7afe6d9d1f48b998cc88fe6f01ba0082788ba4b9\/src\libstd\panicking.rs:193
   8: std::panicking::default_hook
             at /rustc/7afe6d9d1f48b998cc88fe6f01ba0082788ba4b9\/src\libstd\panicking.rs:210
   9: std::panicking::rust_panic_with_hook
             at /rustc/7afe6d9d1f48b998cc88fe6f01ba0082788ba4b9\/src\libstd\panicking.rs:471
  10: std::panicking::begin_panic_handler
             at /rustc/7afe6d9d1f48b998cc88fe6f01ba0082788ba4b9\/src\libstd\panicking.rs:375
  11: core::panicking::panic_fmt
             at /rustc/7afe6d9d1f48b998cc88fe6f01ba0082788ba4b9\/src\libcore\panicking.rs:82
  12: core::option::expect_failed
             at /rustc/7afe6d9d1f48b998cc88fe6f01ba0082788ba4b9\/src\libcore\option.rs:1188
  13: evtx::model::raw::BinXMLRawToken::from_u8
  14: <evtx::json_output::JsonOutput as evtx::xml_output::BinXmlOutput>::visit_open_start_element
  15: evtx::binxml::assemble::parse_tokens
  16: evtx::evtx_record::EvtxRecord::into_json_value
  17: <alloc::vec::Vec<T> as alloc::vec::SpecExtend<T,I>>::from_iter
  18: core::ops::function::impls::<impl core::ops::function::FnMut<A> for &F>::call_mut
  19: rayon::iter::plumbing::Folder::consume_iter
  20: rayon::iter::plumbing::bridge_producer_consumer::helper
  21: <rayon::vec::IntoIter<T> as rayon::iter::IndexedParallelIterator>::with_producer
  22: rayon::iter::collect::special_extend
  23: rayon::iter::collect::<impl rayon::iter::ParallelExtend<T> for alloc::vec::Vec<T>>::par_extend
  24: evtxtools::evtxhandler::EvtxHandler<T>::get_attribute_mapping
  25: alloc::alloc::box_free
  26: alloc::alloc::box_free
  27: crossbeam_epoch::deferred::Deferred::new::call
  28: std::rt::lang_start_internal::{{closure}}
             at /rustc/7afe6d9d1f48b998cc88fe6f01ba0082788ba4b9\/src\libstd\rt.rs:52
  29: std::panicking::try::do_call<closure-0,i32>
             at /rustc/7afe6d9d1f48b998cc88fe6f01ba0082788ba4b9\/src\libstd\panicking.rs:292
  30: panic_unwind::__rust_maybe_catch_panic
             at /rustc/7afe6d9d1f48b998cc88fe6f01ba0082788ba4b9\/src\libpanic_unwind\lib.rs:78
  31: std::panicking::try
             at /rustc/7afe6d9d1f48b998cc88fe6f01ba0082788ba4b9\/src\libstd\panicking.rs:270
  32: std::panic::catch_unwind
             at /rustc/7afe6d9d1f48b998cc88fe6f01ba0082788ba4b9\/src\libstd\panic.rs:394
  33: std::rt::lang_start_internal
             at /rustc/7afe6d9d1f48b998cc88fe6f01ba0082788ba4b9\/src\libstd\rt.rs:51
  34: main
  35: invoke_main
             at d:\agent\_work\3\s\src\vctools\crt\vcstartup\src\startup\exe_common.inl:78
  36: __scrt_common_main_seh
             at d:\agent\_work\3\s\src\vctools\crt\vcstartup\src\startup\exe_common.inl:288
  37: BaseThreadInitThunk
  38: RtlUserThreadStart
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.

Here is the logfile that caused the issue.
E_ShadowCopy6_windows_system32_winevt_logs_Microsoft-Windows-CAPI2%4Operational.zip

could not reproduce described behavior

fd -e evtx -x evtx_dump -f "{.}.xml" will create an xml file next to each evtx file, for all files in folder recursively!

Got:

error: The following required arguments were not provided:
    <INPUT>

USAGE:
    evtx_dump <INPUT> --ansi-codec <ansi-codec> --threads <num-threads> --format <output-format>

For more information try --help

Thx for this stuff, really handy tool :)

<Event> never closed

Hi ! Thank you for your work :)

I noticed that somehow the <Event> tag is never closed:

$ cargo run -- --input samples/new-user-security.evtx
Record 1
<?xml version="1.0" encoding="utf-8"?>
<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
  <System>
    <Provider Name="Microsoft-Windows-Security-Auditing" Guid="54849625-5478-4994-A5BA-3E3B0328C30D">
[...]
    <Data Name="SubjectLogonId">0x3e7</Data>
    <Data Name="PrivilegeList">-</Data>
  </EventData>
Record 2
<?xml version="1.0" encoding="utf-8"?>
<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
  <System>
[...]
    <Data Name="LogonHours">%%1797</Data>
  </EventData>
Record 3
<?xml version="1.0" encoding="utf-8"?>
<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
  <System>
[...]
  </EventData>
Record 4
<?xml version="1.0" encoding="utf-8"?>
<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
  <System>
[...]
  </EventData>

Shouldn't there be a </Event> at the end of each record ?

I only gave the code a quick look, but it seems that the last call to visit_close_element (in src/xml_output.rs) returns early because eof_reached is already true.

By the way, would you be interested in a JSON output, or is that outside the scope of the project?

tailing

Is it possible to tail evtx files, for example by using a custom ReadSeek?

wrong ordering in records returned by records() iterator

Hi, I think there is a bug in event parsing regarding ordering.
The records() iterator returns records in this order:

chunk0: record10,record9,record8,record7,record6,record5,record4,record3,record2,record1
chunk1: record20,record19,record18,record17,record16,record15,record14,record13,record12,record11
and so on ...

Basically, each chunk is ordered in descending order, so the records are not in their original order when pulled from the iterator. This may break uses of the library where the original order needs to be preserved.
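
Until the iterator order is fixed, a caller-side workaround could look like the sketch below. The `Record` struct and its `event_record_id` field are simplified, hypothetical stand-ins for whatever the iterator yields; the sketch assumes each record carries a monotonically increasing id.

```rust
/// Simplified, hypothetical stand-in for a parsed record.
#[derive(Debug, Clone, PartialEq)]
struct Record {
    event_record_id: u64,
}

/// Flattens per-chunk record lists (each possibly in descending order)
/// and restores ascending order by record id.
fn in_original_order(chunks: Vec<Vec<Record>>) -> Vec<Record> {
    let mut records: Vec<Record> = chunks.into_iter().flatten().collect();
    records.sort_by_key(|r| r.event_record_id);
    records
}
```

Sorting the whole file requires buffering every record; if chunks are already in order relative to each other, reversing each chunk as it arrives would be a cheaper alternative.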

pyspark integration

Hi,
I'm trying to use this library with pyspark, since it is super fast and easy to use.
Basically, I am trying to load some evtx files and convert them into JSON files for further processing.
It works great without pyspark; however, when I try to run my parse function under pyspark I get the following error:
AttributeError: type object 'PyEvtxParser' has no attribute '__iter__'

I think it might have to do with pickle serialization and pyspark, but I don't know for sure. Maybe it will work if you make the parser iterable.

Many Thanks in Advance!

---------------------------------------------------------------------------
Py4JJavaError                             Traceback (most recent call last)
<ipython-input-5-d168f16142e1> in <module>
      1 json_str = evtx_files.map(lambda bdata: parseEvents(bdata[1]))
----> 2 json_str.top(1)


~/workspace/lib/python3.8/site-packages/pyspark/rdd.py in top(self, num, key)
   1369             return heapq.nlargest(num, a + b, key=key)
   1370 
-> 1371         return self.mapPartitions(topIterator).reduce(merge)
   1372 
   1373     def takeOrdered(self, num, key=None):

~/workspace/lib/python3.8/site-packages/pyspark/rdd.py in reduce(self, f)
    928             yield reduce(f, iterator, initial)
    929 
--> 930         vals = self.mapPartitions(func).collect()
    931         if vals:
    932             return reduce(f, vals)

~/workspace/lib/python3.8/site-packages/pyspark/rdd.py in collect(self)
    887         """
    888         with SCCallSiteSync(self.context) as css:
--> 889             sock_info = self.ctx._jvm.PythonRDD.collectAndServe(self._jrdd.rdd())
    890         return list(_load_from_socket(sock_info, self._jrdd_deserializer))
    891 

~/workspace/lib/python3.8/site-packages/py4j/java_gateway.py in __call__(self, *args)
   1302 
   1303         answer = self.gateway_client.send_command(command)
-> 1304         return_value = get_return_value(
   1305             answer, self.gateway_client, self.target_id, self.name)
   1306 

~/workspace/lib/python3.8/site-packages/pyspark/sql/utils.py in deco(*a, **kw)
    126     def deco(*a, **kw):
    127         try:
--> 128             return f(*a, **kw)
    129         except py4j.protocol.Py4JJavaError as e:
    130             converted = convert_exception(e.java_exception)

~/workspace/lib/python3.8/site-packages/py4j/protocol.py in get_return_value(answer, gateway_client, target_id, name)
    324             value = OUTPUT_CONVERTER[type](answer[2:], gateway_client)
    325             if answer[1] == REFERENCE_TYPE:
--> 326                 raise Py4JJavaError(
    327                     "An error occurred while calling {0}{1}{2}.\n".
    328                     format(target_id, ".", name), value)

Py4JJavaError: An error occurred while calling z:org.apache.spark.api.python.PythonRDD.collectAndServe.
: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 1 times, most recent failure: Lost task 0.0 in stage 0.0 (TID 0, 192.168.1.123, executor driver): org.apache.spark.api.python.PythonException: Traceback (most recent call last):
  File "/home/workspace/lib/python3.8/site-packages/pyspark/python/lib/pyspark.zip/pyspark/worker.py", line 587, in main
    func, profiler, deserializer, serializer = read_command(pickleSer, infile)
  File "/home/workspace/lib/python3.8/site-packages/pyspark/python/lib/pyspark.zip/pyspark/worker.py", line 74, in read_command
    command = serializer._read_with_length(file)
  File "/home/workspace/lib/python3.8/site-packages/pyspark/python/lib/pyspark.zip/pyspark/serializers.py", line 172, in _read_with_length
    return self.loads(obj)
  File "/home/workspace/lib/python3.8/site-packages/pyspark/python/lib/pyspark.zip/pyspark/serializers.py", line 458, in loads
    return pickle.loads(obj, encoding=encoding)
AttributeError: type object 'PyEvtxParser' has no attribute '__iter__'

	at org.apache.spark.api.python.BasePythonRunner$ReaderIterator.handlePythonException(PythonRunner.scala:503)
	at org.apache.spark.api.python.PythonRunner$$anon$3.read(PythonRunner.scala:638)
	at org.apache.spark.api.python.PythonRunner$$anon$3.read(PythonRunner.scala:621)
	at org.apache.spark.api.python.BasePythonRunner$ReaderIterator.hasNext(PythonRunner.scala:456)
	at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:37)
	at scala.collection.Iterator.foreach(Iterator.scala:941)
	at scala.collection.Iterator.foreach$(Iterator.scala:941)
	at org.apache.spark.InterruptibleIterator.foreach(InterruptibleIterator.scala:28)
	at scala.collection.generic.Growable.$plus$plus$eq(Growable.scala:62)
	at scala.collection.generic.Growable.$plus$plus$eq$(Growable.scala:53)
	at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:105)
	at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:49)
	at scala.collection.TraversableOnce.to(TraversableOnce.scala:315)
	at scala.collection.TraversableOnce.to$(TraversableOnce.scala:313)
	at org.apache.spark.InterruptibleIterator.to(InterruptibleIterator.scala:28)
	at scala.collection.TraversableOnce.toBuffer(TraversableOnce.scala:307)
	at scala.collection.TraversableOnce.toBuffer$(TraversableOnce.scala:307)
	at org.apache.spark.InterruptibleIterator.toBuffer(InterruptibleIterator.scala:28)
	at scala.collection.TraversableOnce.toArray(TraversableOnce.scala:294)
	at scala.collection.TraversableOnce.toArray$(TraversableOnce.scala:288)
	at org.apache.spark.InterruptibleIterator.toArray(InterruptibleIterator.scala:28)
	at org.apache.spark.rdd.RDD.$anonfun$collect$2(RDD.scala:1004)
	at org.apache.spark.SparkContext.$anonfun$runJob$5(SparkContext.scala:2139)
	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
	at org.apache.spark.scheduler.Task.run(Task.scala:127)
	at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:446)
	at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1377)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:449)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
	at java.base/java.lang.Thread.run(Thread.java:834)

Driver stacktrace:
	at org.apache.spark.scheduler.DAGScheduler.failJobAndIndependentStages(DAGScheduler.scala:2059)
	at org.apache.spark.scheduler.DAGScheduler.$anonfun$abortStage$2(DAGScheduler.scala:2008)
	at org.apache.spark.scheduler.DAGScheduler.$anonfun$abortStage$2$adapted(DAGScheduler.scala:2007)
	at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
	at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
	at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
	at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:2007)
	at org.apache.spark.scheduler.DAGScheduler.$anonfun$handleTaskSetFailed$1(DAGScheduler.scala:973)
	at org.apache.spark.scheduler.DAGScheduler.$anonfun$handleTaskSetFailed$1$adapted(DAGScheduler.scala:973)
	at scala.Option.foreach(Option.scala:407)
	at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:973)
	at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:2239)
	at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2188)
	at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2177)
	at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:49)
	at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:775)
	at org.apache.spark.SparkContext.runJob(SparkContext.scala:2099)
	at org.apache.spark.SparkContext.runJob(SparkContext.scala:2120)
	at org.apache.spark.SparkContext.runJob(SparkContext.scala:2139)
	at org.apache.spark.SparkContext.runJob(SparkContext.scala:2164)
	at org.apache.spark.rdd.RDD.$anonfun$collect$1(RDD.scala:1004)
	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
	at org.apache.spark.rdd.RDD.withScope(RDD.scala:388)
	at org.apache.spark.rdd.RDD.collect(RDD.scala:1003)
	at org.apache.spark.api.python.PythonRDD$.collectAndServe(PythonRDD.scala:168)
	at org.apache.spark.api.python.PythonRDD.collectAndServe(PythonRDD.scala)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.base/java.lang.reflect.Method.invoke(Method.java:566)
	at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
	at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
	at py4j.Gateway.invoke(Gateway.java:282)
	at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
	at py4j.commands.CallCommand.execute(CallCommand.java:79)
	at py4j.GatewayConnection.run(GatewayConnection.java:238)
	at java.base/java.lang.Thread.run(Thread.java:834)
Caused by: org.apache.spark.api.python.PythonException: Traceback (most recent call last):
  File "/home/workspace/lib/python3.8/site-packages/pyspark/python/lib/pyspark.zip/pyspark/worker.py", line 587, in main
    func, profiler, deserializer, serializer = read_command(pickleSer, infile)
  File "/home/workspace/lib/python3.8/site-packages/pyspark/python/lib/pyspark.zip/pyspark/worker.py", line 74, in read_command
    command = serializer._read_with_length(file)
  File "/home/workspace/lib/python3.8/site-packages/pyspark/python/lib/pyspark.zip/pyspark/serializers.py", line 172, in _read_with_length
    return self.loads(obj)
  File "/home/workspace/lib/python3.8/site-packages/pyspark/python/lib/pyspark.zip/pyspark/serializers.py", line 458, in loads
    return pickle.loads(obj, encoding=encoding)
AttributeError: type object 'PyEvtxParser' has no attribute '__iter__'

	at org.apache.spark.api.python.BasePythonRunner$ReaderIterator.handlePythonException(PythonRunner.scala:503)
	at org.apache.spark.api.python.PythonRunner$$anon$3.read(PythonRunner.scala:638)
	at org.apache.spark.api.python.PythonRunner$$anon$3.read(PythonRunner.scala:621)
	at org.apache.spark.api.python.BasePythonRunner$ReaderIterator.hasNext(PythonRunner.scala:456)
	at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:37)
	at scala.collection.Iterator.foreach(Iterator.scala:941)
	at scala.collection.Iterator.foreach$(Iterator.scala:941)
	at org.apache.spark.InterruptibleIterator.foreach(InterruptibleIterator.scala:28)
	at scala.collection.generic.Growable.$plus$plus$eq(Growable.scala:62)
	at scala.collection.generic.Growable.$plus$plus$eq$(Growable.scala:53)
	at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:105)
	at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:49)
	at scala.collection.TraversableOnce.to(TraversableOnce.scala:315)
	at scala.collection.TraversableOnce.to$(TraversableOnce.scala:313)
	at org.apache.spark.InterruptibleIterator.to(InterruptibleIterator.scala:28)
	at scala.collection.TraversableOnce.toBuffer(TraversableOnce.scala:307)
	at scala.collection.TraversableOnce.toBuffer$(TraversableOnce.scala:307)
	at org.apache.spark.InterruptibleIterator.toBuffer(InterruptibleIterator.scala:28)
	at scala.collection.TraversableOnce.toArray(TraversableOnce.scala:294)
	at scala.collection.TraversableOnce.toArray$(TraversableOnce.scala:288)
	at org.apache.spark.InterruptibleIterator.toArray(InterruptibleIterator.scala:28)
	at org.apache.spark.rdd.RDD.$anonfun$collect$2(RDD.scala:1004)
	at org.apache.spark.SparkContext.$anonfun$runJob$5(SparkContext.scala:2139)
	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
	at org.apache.spark.scheduler.Task.run(Task.scala:127)
	at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:446)
	at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1377)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:449)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)

add support for WEVT_TEMPLATE evtx template structure parsing

In a recent discussion, it became clear to me that there's a desire for evtx tooling that supports an offline database of templates. Here's some relevant background on the topic:

The forensikblog.de post describes exactly my goal: to process the resource directory of PE files and collect evtx templates for subsequent use. For example, to put the templates in a sqlite database, carve evtx records from unallocated space, and render the records using the templates from the database. Going forward, I expect to use this evtx library over python-evtx, for many reasons :-).

However, getting this to work may take some changes to this evtx library. I'll describe what I find in this thread. I hope that we can work together to support all these use cases!

Incidentally, I've been chatting with @forensicmatt, who's also interested in working with evtx templates, so he may chime in too.


wevt_template is my work in progress project for extracting evtx templates from PE files.

Here is services.exe (renamed to gif extension) that I'll reference below.


In the attached services.exe at offset 0xA3020 with length 0x4b7e is the embedded instrumentation manifest that includes evtx templates:

00000000:  43 52 49 4d 7c 4b 00 00 05 00 01 00 03 00 00 00   CRIM|K..........
00000010:  5b 71 63 00 da ee 07 40 94 29 ad 52 6f 62 69 6e   [qc....@.).Robin
00000020:  4c 00 00 00 97 4c 18 06 01 52 0e 48 92 af 3a 36   L....L...R.H..:6
00000030:  26 c5 b1 40 f4 08 00 00 d1 08 59 55 d7 a6 95 46   &..@......YU...F
00000040:  8e 1e 26 93 1d 20 12 f4 48 0c 00 00 57 45 56 54   ..&.. ..H...WEVT
00000050:  a8 08 00 00 01 00 00 90 08 00 00 00 05 00 00 00   ................
00000060:  9c 00 00 00 07 00 00 00 08 01 00 00 0d 00 00 00   ................
00000070:  d8 04 00 00 02 00 00 00 24 05 00 00 00 00 00 00   ........$.......
00000080:  a4 05 00 00 01 00 00 00 e4 05 00 00 03 00 00 00   ................
00000090:  f4 06 00 00 04 00 00 00 68 07 00 00 43 48 41 4e   ........h...CHAN
000000a0:  6c 00 00 00 01 00 00 00 00 00 00 00 b8 00 00 00   l...............
000000b0:  10 00 00 00 ff ff ff ff 50 00 00 00 4d 00 69 00   ........P...M.i.
000000c0:  63 00 72 00 6f 00 73 00 6f 00 66 00 74 00 2d 00   c.r.o.s.o.f.t.-.
000000d0:  57 00 69 00 6e 00 64 00 6f 00 77 00 73 00 2d 00   W.i.n.d.o.w.s.-.
000000e0:  53 00 65 00 72 00 76 00 69 00 63 00 65 00 73 00   S.e.r.v.i.c.e.s.
000000f0:  2f 00 44 00 69 00 61 00 67 00 6e 00 6f 00 73 00   /.D.i.a.g.n.o.s.
00000100:  74 00 69 00 63 00 00 00 54 54 42 4c d0 03 00 00   t.i.c...TTBL....
00000110:  02 00 00 00 54 45 4d 50 c0 00 00 00 01 00 00 00   ....TEMP........
00000120:  01 00 00 00 a8 01 00 00 01 00 00 00 fe be 19 ab   ................
00000130:  f0 23 65 5f 2f fd 44 4c 0b e7 4f 99 0f 01 01 00   .#e_/.DL..O.....
00000140:  01 ff ff 5e 00 00 00 44 82 09 00 45 00 76 00 65   ...^...D...E.v.e
00000150:  00 6e 00 74 00 44 00 61 00 74 00 61 00 00 00 02   .n.t.D.a.t.a....
...

Notably, this services.exe from Win10 2020H1 uses the CRIM version 5.1 (in contrast to the libexe description for version 3.1). We'll see why this matters in a moment.

At 0xA306C is the start of an event provider structure (WEVT) for Microsoft-Windows-Services/Diagnostic:

00000000  57 45 56 54 a8 08 00 00 01 00 00 90 08 00 00 00  |WEVT¨...........|
00000010  05 00 00 00 9c 00 00 00 07 00 00 00 08 01 00 00  |................|
00000020  0d 00 00 00 d8 04 00 00 02 00 00 00 24 05 00 00  |....Ø.......$...|
00000030  00 00 00 00 a4 05 00 00 01 00 00 00 e4 05 00 00  |....¤.......ä...|
00000040  03 00 00 00 f4 06 00 00 04 00 00 00 68 07 00 00  |....ô.......h...|
...

At 0xA3128 is the template table (TTBL) and finally at 0xA315C is a binary XML template structure. Ideally, we'd be able to parse the data using this evtx library. I'm currently using the following to parse the data:

        let de = evtx::binxml::deserializer::BinXmlDeserializer::init(
            &buf,
            0x0,
            None,
            false,
            encoding::all::WINDOWS_1252,
        );

        let mut iterator = de.iter_tokens(None)?;

        // Print every token until the iterator is exhausted.
        while let Some(t) = iterator.next() {
            debug!("token: {:#x?}", t);
        }

Anyways, here is the binary template:

00000000  0f 01 01 00 01 ff ff 5e 00 00 00 44 82 09 00 45  |.....ÿÿ^...D...E|
00000010  00 76 00 65 00 6e 00 74 00 44 00 61 00 74 00 61  |.v.e.n.t.D.a.t.a|
00000020  00 00 00 02 41 ff ff 3d 00 00 00 8a 6f 04 00 44  |....Aÿÿ=....o..D|
00000030  00 61 00 74 00 61 00 00 00 25 00 00 00 06 4b 95  |.a.t.a...%....K.|
00000040  04 00 4e 00 61 00 6d 00 65 00 00 00 05 01 09 00  |..N.a.m.e.......|
00000050  47 00 72 00 6f 00 75 00 70 00 4e 00 61 00 6d 00  |G.r.o.u.p.N.a.m.|
00000060  65 00 02 0d 00 00 01 04 04 00 00 00 00 00 00 00  |e...............|
...

Unfortunately, this doesn't parse well with the code from this library. Let me explain what I see:

00000000  0f 01 01 00 BinXmlFragmentHeader{version 1.1, flags: 0x0}
                      01 OpenStartElement
                         ff ff dependency identifier
                               5e 00 00 00 data size=0x5E
                                           44 82 <<< hash???
                                                 09 00 number of characters in following wstring
                                                       45 wstring="EventData"
00000010  00 76 00 65 00 6e 00 74 00 44 00 61 00 74 00 61
00000020  00 00 00 end(wstring="EventData")
                   02 CloseStartElement
                      41 OpenStartElement with Attributes
                         ff ff 3d ...

My guess is that in (at least) format version 5.1 (or 4+???), strings are stored inline rather than as references. I think the structure for tag 01 is maybe:

struct OpenStartElementNoAttributes {
  tag: u8,                             // == 0x01
  dependency_identifier: Option<u16>,  // 0xFFFF -> None
  data_size: u32,
  name_hash: u16,                      // unknown algorithm
  name_character_count: u16,
  name: OsString<utf16>                // name_character_count + trailing NULL character
}
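
To check this guess against the dump above, here is a small standalone sketch that parses the guessed layout with the standard library only. The type and function names are hypothetical and the layout is only the hypothesis stated above; this is not code from the evtx library, and the hash is read but not verified because its algorithm is unknown.

```rust
/// Hypothetical parsed form of the guessed tag-0x01 layout.
#[derive(Debug, PartialEq)]
struct InlineOpenStartElement {
    dependency_identifier: Option<u16>, // 0xFFFF -> None
    data_size: u32,
    name_hash: u16, // algorithm unknown, not verified here
    name: String,
}

/// Reads a little-endian u16 at `offset`, or `None` if out of bounds.
fn le_u16(buf: &[u8], offset: usize) -> Option<u16> {
    let b = buf.get(offset..offset + 2)?;
    Some(u16::from_le_bytes([b[0], b[1]]))
}

/// Reads a little-endian u32 at `offset`, or `None` if out of bounds.
fn le_u32(buf: &[u8], offset: usize) -> Option<u32> {
    let b = buf.get(offset..offset + 4)?;
    Some(u32::from_le_bytes([b[0], b[1], b[2], b[3]]))
}

/// Parses one element with an inline name and returns it together with
/// the number of bytes consumed. Returns `None` on any mismatch.
fn parse_inline_open_start_element(buf: &[u8]) -> Option<(InlineOpenStartElement, usize)> {
    if *buf.first()? != 0x01 {
        return None;
    }
    let dependency_identifier = match le_u16(buf, 1)? {
        0xFFFF => None,
        id => Some(id),
    };
    let data_size = le_u32(buf, 3)?;
    let name_hash = le_u16(buf, 7)?;
    let char_count = le_u16(buf, 9)? as usize;

    // The name is `char_count` UTF-16LE code units followed by a NUL unit.
    let name_start = 11;
    let name_end = name_start + char_count * 2;
    let units: Vec<u16> = buf
        .get(name_start..name_end)?
        .chunks_exact(2)
        .map(|pair| u16::from_le_bytes([pair[0], pair[1]]))
        .collect();
    let name = String::from_utf16(&units).ok()?;
    if le_u16(buf, name_end)? != 0 {
        return None;
    }

    let element = InlineOpenStartElement {
        dependency_identifier,
        data_size,
        name_hash,
        name,
    };
    Some((element, name_end + 2))
}
```

Fed the bytes that follow the four-byte fragment header in the dump, the sketch stops right before the 0x02 CloseStartElement token, which is consistent with the guessed layout for this one element.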

This inline string strategy seems to be used in other parts of the template, too.

I think these strings share a structure with the BinXmlName described by libevtx:

Offset  Size  Description
0       4     Unknown
4       2     Name hash (which hash algorithm?)
6       2     Number of characters
8       …     UTF-16 little-endian string with an end-of-string character

So, I wonder if it's reasonable to extend read_open_start_element to support this variant of the format and, if so, how to manage the set of features that each variant may support (evtx-file-mode vs WEVT_MODE vs ....).

In a subsequent discussion, assuming we can parse out these templates, we can chat about how to apply the templates to data carved from unallocated space. But I haven't gotten that far yet :-)

[Question] Alter JSON output

Dear Omer,

Awesome work on this library, it is really blazing fast.

I hope you can help me with the following question about the JSON serializer. I would like to alter the JSON data that is outputted by the parser and I am looking for the best way to do it.

By default it outputs something like this:

{
        "Event": {
            "EventData": {
                "Binary": null,
        ...
        "Event_attributes": {
            "xmlns": "http://schemas.microsoft.com/win/2004/08/events/event"
        }
}

Which I would like to append a few properties to, e.g.:

{
        "Event": {
            "EventData": {
                "Binary": null,
        ...
        "Event_attributes": {
            "xmlns": "http://schemas.microsoft.com/win/2004/08/events/event"
        },
    "fields": {
        "host": "WIN-TEST",
        "source": "Setup.evtx",
        "time": 1623066248.0
    }
}

This should happen somewhere around this snippet of code, which returns a record which contains the data object which is already a string (from the into_json function):

            EvtxOutputFormat::JSON => {
                for record in parser.records_json() {
                    self.dump_record(record)?;   
                }
            }

The following solutions were the ones I could think of:

  1. Alter the string to insert the fields part.
  • Advantages
    • Easy to implement
    • Fast?
  • Disadvantages
    • Not flexible
    • Error prone
  2. Parse the record.data string to object with serde_json, alter it, and convert it to string again.
  • Advantages
    • Easy to implement
    • Flexible
    • Not prone to errors
  • Disadvantages
    • Compromises performance due to inherent inefficiency
  3. Implement own records_json function
  • Advantages
    • Fast?
    • Flexible
  • Disadvantages
    • I'm a terrible rust developer
    • Introduces a lot of code from your library which will be outdated
  4. insert even better solution here

I'm asking for your advice on this because I wasn't able to figure out how to do it properly in Rust. Performance is also important for me, so I want to find a very efficient solution.

For solution (3) I already tried to implement something but that doesn't work. Maybe you can provide some guidance or you might even have a much better solution in mind.

// Stable shim until https://github.com/rust-lang/rust/issues/59359 is merged.
// Taken from proposed std code.
pub trait ReadSeek: Read + Seek {
    fn tell(&mut self) -> io::Result<u64> {
        self.seek(SeekFrom::Current(0))
    }
    fn stream_len(&mut self) -> io::Result<u64> {
        let old_pos = self.tell()?;
        let len = self.seek(SeekFrom::End(0))?;

        // Avoid seeking a third time when we were already at the end of the
        // stream. The branch is usually way cheaper than a seek operation.
        if old_pos != len {
            self.seek(SeekFrom::Start(old_pos))?;
        }

        Ok(len)
    }
}

impl<T: Read + Seek> ReadSeek for T {}

pub struct JsonSerialize<'a, T: ReadSeek> {
    settings: ParserSettings,
    parser: &'a mut EvtxParser<T>,
}


impl<T: ReadSeek> JsonSerialize<'_, T> {

    /// Return an iterator over all the records.
    /// Records will be JSON-formatted.
    pub fn records_json(
        &mut self,
    ) -> impl Iterator<Item = Result<SerializedEvtxRecord<String>, EvtxError>> + '_ {
        EvtxParser::serialized_records(self.parser, |record| record.and_then(|record| self.into_json(record)))
    }

    /// Consumes the record and parse it, producing a JSON serialized record.
    fn into_json(self, record: EvtxRecord) -> Result<SerializedEvtxRecord<String>, EvtxError> {
        let indent = self.settings.should_indent();
        let mut record_with_json_value = EvtxRecord::into_json_value(record)?;

        let data = if indent {
            serde_json::to_string_pretty(&record_with_json_value.data)
                .map_err(SerializationError::from)?
        } else {
            serde_json::to_string(&record_with_json_value.data).map_err(SerializationError::from)?
        };

        Ok(SerializedEvtxRecord {
            event_record_id: record_with_json_value.event_record_id,
            timestamp: record_with_json_value.timestamp,
            data,
        })
    }
}
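For comparison, the first option from the list above (altering the already serialized string) could look roughly like the following. This is only a sketch: `append_fields` is a hypothetical helper, it assumes each record is a single top-level JSON object, and it trusts that `fields_json` is itself valid JSON.

```rust
/// Hypothetical helper: splices `"fields": <fields_json>` in front of the
/// final closing brace of a serialized record.
/// Returns None if the record does not end with `}`.
fn append_fields(record_json: &str, fields_json: &str) -> Option<String> {
    let trimmed = record_json.trim_end();
    let close = trimmed.rfind('}')?;
    if close + 1 != trimmed.len() {
        return None;
    }

    let head = trimmed[..close].trim_end();
    // An empty object (`{}`) needs no separating comma.
    let needs_comma = !head.ends_with('{');

    let mut out = String::with_capacity(trimmed.len() + fields_json.len() + 12);
    out.push_str(head);
    if needs_comma {
        out.push(',');
    }
    out.push_str("\"fields\":");
    out.push_str(fields_json);
    out.push('}');
    Some(out)
}
```

This avoids a second parse of the record, at the cost of the fragility already listed under the disadvantages of that option.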

# in JSON field name prevents import in GCP BigQuery

Thanks for the effort to build this great tool; we're throwing forwarded log files at it and really appreciate the performance boost!

There's one minor step which required preprocessing for our use case, as we are loading data into Google's BigQuery.

I unfortunately don't have a build environment set up for Rust at the moment, but it seems the responsible code is here, impacting both #attributes and #text:

value.insert("#attributes".to_owned(), Value::Object(attributes));

object.insert("#text".to_owned(), value.clone().into());

Is there a reason I'm missing for using a special character in these two field names? It's a rather minor issue and we can run sed, but it would save some steps.

Thanks in advance!
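Until the field names change, the sed step mentioned above can also be expressed in Rust. This is a minimal sketch; the replacement names `_attributes` and `_text` are arbitrary assumptions, and the function only matches the two keys exactly as the serializer writes them (quote, name, quote, colon):

```rust
/// Hypothetical post-processing step: renames the two `#`-prefixed keys in
/// one line of JSON output so that the field names contain no `#`.
fn rename_special_keys(json_line: &str) -> String {
    json_line
        .replace("\"#attributes\":", "\"_attributes\":")
        .replace("\"#text\":", "\"_text\":")
}
```

Quotes inside JSON string values are escaped with a backslash, so a value that merely mentions `#text` does not match the key pattern.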

Passing a file via stdin?

Is it possible to pass a file via standard input? It looks like there's some seeking going on that would prevent this at the moment. I tried:

evtx_dump -o jsonl /dev/stdin

This prints:

Error: Failed to open evtx file at: /dev/stdin

Caused by:
    0: An error occurred while trying to deserialize evtx stream.
    1: An expected I/O error has occurred
    2: Offset `0x00000000 (0)` - An error has occurred while trying to deserialize binary stream 
       failed to seek in file_header
       
           Original message:
           `Illegal seek (os error 29)`
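The "Illegal seek" error arises because a pipe cannot seek. One possible workaround, sketched here with the standard library only, is to read the whole stream into memory first and wrap it in a `Cursor`, which implements both `Read` and `Seek`. `buffer_for_seeking` is a hypothetical helper name:

```rust
use std::io::{self, Cursor, Read};

/// Hypothetical helper: drains a non-seekable reader (such as stdin) into
/// memory and returns a seekable in-memory cursor over the bytes.
fn buffer_for_seeking<R: Read>(mut input: R) -> io::Result<Cursor<Vec<u8>>> {
    let mut data = Vec::new();
    input.read_to_end(&mut data)?;
    Ok(Cursor::new(data))
}
```

A caller would use `buffer_for_seeking(std::io::stdin().lock())` and hand the result to whichever constructor of the library accepts an in-memory buffer or a `Read + Seek` value (to be checked against the crate documentation). The trade-off is that the whole file is held in memory.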

Missing Records

I was doing some tests between a couple different tools.

With the linked to file down below, this library is missing the following events (tracked via EventRecordID): [14358, 14359, 14360, 14361, 14362, 14363, 14364, 14365, 14366, 14367, 14368, 14369, 14370, 14371, 14372, 14373, 14374, 14375, 14376, 14377, 14378, 14379, 14380, 14381, 14382, 14383, 14384, 14385, 14386, 14387, 14388, 14389, 14390, 14391, 14392, 14393, 14394, 14395, 14396, 14397, 14398, 14399, 14400, 14401, 14402, 14403, 14404, 14405, 14406, 14407, 14408, 14409, 14410, 14411, 14412, 14413, 14414, 14415, 14416, 14417, 14418, 14419, 14420, 14421, 14422, 14423, 14424, 14425, 14426, 14427, 14428, 14429, 14430, 14431, 14432, 14433, 14434, 14435, 14436, 14437, 14438, 14439, 14440, 14441, 14442, 14443, 14444, 14445, 14446, 14447, 14448, 14449, 14450, 14451, 14452, 14453, 14454, 14455, 14456, 14457, 14458, 14459, 14460, 14461, 14462, 14463, 14464, 14465, 14466, 14467, 14468, 14469, 14470, 14471, 14472, 14473, 14474, 14475, 14476, 14477, 14478, 14479, 14480, 14481, 14482, 14483, 14484, 14485, 14486, 14487, 14488, 14489, 14490, 14491, 14492, 14493, 14494, 14495, 14496, 14497, 14498, 14499, 14500, 14501, 14502, 14503, 14504, 14505, 14506, 14507, 14508, 14509, 14510, 14511, 14512, 14513, 14514, 14515, 14516, 14517, 14518, 14519, 14520, 14521, 14522, 14523, 14524, 14525, 14526, 14527, 14528, 14529, 14530, 14531, 14532, 14533, 14534, 14535, 14536, 14537, 14538, 14539, 14540, 14541, 14542, 14543, 14544, 14545, 14546, 14547, 14548, 14549, 14550, 14551, 14552, 14553, 14554, 14555, 14556, 14557, 14558, 14559, 14560, 14561, 14562, 14563, 14564, 14565, 14566, 14567, 14568, 14569, 14570, 14571, 14572, 14573, 14574, 14575, 14576, 14577, 14578, 14579, 14580, 14581, 14582, 14583, 14584, 14585, 14586, 14587, 14588, 14589, 14590, 14591, 14592, 14593, 14594, 14595, 14596, 14597, 14598, 14599, 14600, 14601, 14602, 14603, 14604, 14605, 14606, 14607, 14608, 14609, 14610, 14611, 14612, 14613, 14614, 14615, 14616, 14617, 14618, 14619, 14620, 14621]

It almost looks like there is a block being skipped? Maybe due to a range index?

Test data I was using (link expires 6/1/2019):
https://www.dropbox.com/s/kdy4fxp3ndvq3r9/event_testing.zip?dl=0

The zip has the output from the other tools used for comparison. See below for tool info.

Rust info:
evtx version = "0.1.6"
stable-x86_64-pc-windows-msvc (default)
rustc 1.34.0 (91856ed52 2019-04-10)

Other tools used for comparison:
libevtx - evtxexport.exe [Metz]
https://github.com/libyal/libevtx/releases/tag/20181227

EvtxECmd [Zim]
https://github.com/EricZimmerman/evtx

Invalid behaviour when parsing Evtx from Windows Event Forwarding

Hi,

We are trying to use your library to parse Windows logs, but we encountered a strange error when parsing EVTX files coming from a Windows Event Collector server.
In the event viewer, the XML is the following:

- <EventData>
  <Data>Set-Mailbox</Data> 
  <Data>-Identity "Administrateur" -DeliverToMailboxAndForward "False" -ForwardingSmtpAddress "smtp:[email protected]"</Data> 
  <Data>ave.local/Users/Administrateur</Data> 
  <Data>S-1-5-21-186559946-3925841745-111227986-500</Data> 
  <Data>S-1-5-21-186559946-3925841745-111227986-500</Data> 
  <Data>Remote-ManagementShell-Unknown</Data> 
  <Data>5668 w3wp#MSExchangePowerShellAppPool</Data> 
  <Data /> 
  <Data>5</Data> 
  <Data>00:00:26.0389557</Data> 
  <Data>Afficher la forêt entière : 'False', Portée par défaut : « ave.local », Configuration du contrôleur de domaine : « DC.ave.local », Catalogue global préféré : « DC.ave.local », Contrôleurs de domaine préférés : « { DC.ave.local } »</Data> 
  <Data /> 
  <Data /> 
  <Data /> 
  <Data /> 
  <Data /> 
  <Data /> 
  <Data>False</Data> 
  <Data /> 
  <Data>0 objects execution has been proxied to remote server.</Data> 
  <Data /> 
  <Data /> 
  <Data>0</Data> 
  <Data>ActivityId: a3591746-a27b-447a-b8be-ff54ae3a46f1</Data> 
  <Data>ServicePlan:;IsAdmin:True;</Data> 
  <Data /> 
  <Data>fr-FR</Data> 
  </EventData>

If we convert the original EVTX, we obtain the following JSON:

      "Data": {
        "#text": [
          "Set-Mailbox",
          "-Identity \"Administrateur\" -DeliverToMailboxAndForward \"False\" -ForwardingSmtpAddress \"smtp:[email protected]\"",
          "ave.local/Users/Administrateur",
          "S-1-5-21-186559946-3925841745-111227986-500",
          "S-1-5-21-186559946-3925841745-111227986-500",
          "Remote-ManagementShell-Unknown",
          "5668 w3wp#MSExchangePowerShellAppPool",
          "",
          "5",
          "00:00:26.0389557",
          "Afficher la forêt entière : 'False', Portée par défaut : « ave.local », Configuration du contrôleur de domaine : « DC.ave.local », Catalogue global préféré : « DC.ave.local », Contrôleurs de domaine préférés : « { DC.ave.local } »",
          "",
          "",
          "",
          "",
          "",
          "",
          "False",
          "",
          "0 objects execution has been proxied to remote server.",
          "",
          "",
          "0",
          "ActivityId: a3591746-a27b-447a-b8be-ff54ae3a46f1",
          "ServicePlan:;IsAdmin:True;",
          "",
          "fr-FR"
        ]
      }
    },

But when the log has been forwarded using WEF and the EVTX is parsed, we obtain the following JSON:

    "EventData": {
      "Data": {
        "#text": "fr-FR"
      }
    },

As you can see, almost all the information is lost. If you want to run some tests, the EVTX files are here: MSExchange_Management.zip (https://github.com/omerbenamram/evtx/files/7571802/MSExchange_Management.zip)

Thank you!
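The two outputs above differ in whether repeated sibling elements are accumulated or overwritten. The accumulating behaviour can be sketched as follows; this is an illustration of the expected result, not the library's actual serializer, and `group_children` is a hypothetical name:

```rust
use std::collections::BTreeMap;

/// Hypothetical illustration: collects the text of sibling elements by
/// element name, appending to a list instead of keeping only the last one.
fn group_children(children: &[(&str, &str)]) -> BTreeMap<String, Vec<String>> {
    let mut grouped: BTreeMap<String, Vec<String>> = BTreeMap::new();
    for &(name, text) in children {
        grouped
            .entry(name.to_string())
            .or_default()
            .push(text.to_string());
    }
    grouped
}
```

With a plain `insert` in place of `entry(..).or_default().push(..)`, only the last `<Data>` value ("fr-FR") would survive, which matches the lossy output shown for the forwarded log.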

Tests do not run when `evtx` is tested as part of a workspace

The cause is that samples_dir() assumes that the project dir is the current working dir, which is not the case when a workspace is being used.

Solution:

diff --git a/tests/fixtures.rs b/tests/fixtures.rs
index 5ff166e..4d760ce 100644
--- a/tests/fixtures.rs
+++ b/tests/fixtures.rs
@@ -20,11 +20,7 @@ pub fn ensure_env_logger_initialized() {
 }

 pub fn samples_dir() -> PathBuf {
-    PathBuf::from(file!())
-        .parent()
-        .unwrap()
-        .parent()
-        .unwrap()
+    PathBuf::from(env!("CARGO_MANIFEST_DIR"))
         .join("samples")
         .canonicalize()
         .unwrap()

Error while parsing .evtx files with unknown file header flags and chunk flags

Hi,
I tried to parse these files: https://github.com/fox-it/danderspritz-evtx/tree/master/examples, and pre-Security.evtx was parsed successfully while post-Security.evtx wasn't. I got the following error:

Caused by:
    0: An error occurred while trying to deserialize evtx stream.
    1: Unknown EVTX record header flags value: 462880768

I don't know what these flags are, but they can potentially appear in some other evtx-files. Maybe the problem can be solved by replacing from_bits with from_bits_truncate in evtx_file_header.rs and evtx_chunk.rs. For example:

let raw_flags = try_read!(stream, u32, "file_header_flags")?;
let flags = match HeaderFlags::from_bits(raw_flags) {
    Some(val) => val,
    None => return Err(DeserializationError::UnknownEvtxHeaderFlagValue { value: raw_flags }),
};

should become

let raw_flags = try_read!(stream, u32, "file_header_flags")?;
let flags = HeaderFlags::from_bits_truncate(raw_flags);

And

let raw_flags = try_read!(input, u32)?;
let flags = match ChunkFlags::from_bits(raw_flags) {
    Some(val) => val,
    None => {
        return Err(DeserializationError::UnknownEvtxHeaderFlagValue { value: raw_flags })
    }
};

should become

let raw_flags = try_read!(input, u32)?;
let flags = ChunkFlags::from_bits_truncate(raw_flags);

or something like that.
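The difference between the strict and the truncating conversion can be shown without the bitflags crate. In this sketch the two flag constants and the function names are illustrative assumptions, not the crate's definitions:

```rust
// Illustrative flag values; the real set of known flags may differ.
const FLAG_DIRTY: u32 = 0x1;
const FLAG_FULL: u32 = 0x2;
const KNOWN_FLAGS: u32 = FLAG_DIRTY | FLAG_FULL;

/// Strict conversion: rejects any value with bits outside the known set,
/// mirroring the behaviour of a `from_bits`-style constructor.
fn flags_strict(raw: u32) -> Option<u32> {
    if raw & !KNOWN_FLAGS == 0 {
        Some(raw)
    } else {
        None
    }
}

/// Lenient conversion: silently drops unknown bits, mirroring the behaviour
/// of a `from_bits_truncate`-style constructor.
fn flags_truncate(raw: u32) -> u32 {
    raw & KNOWN_FLAGS
}
```

The value from the error message, 462880768, is 0x1B970000: all of its set bits lie in the upper half, so the strict variant rejects it while the truncating variant yields an empty flag set.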
Thank you!

Issue installing the evtx library on windows

Hi, thanks for the library. I have one question regarding installing the library on Windows, as it shows the error below. I am using the latest Python 3.9.

Collecting evtx
Using cached evtx-0.6.8.tar.gz (2.2 kB)
ERROR: Command errored out with exit status 1:
command: 'c:\users\user\appdata\local\programs\python\python39\python.exe' -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'C:\Users\user\AppData\Local\Temp\pip-install-yxibz94l\evtx_04d644cd67554da6b2028e9d4b820743\setup.py'"'"'; file='"'"'C:\Users\user\AppData\Local\Temp\pip-install-yxibz94l\evtx_04d644cd67554da6b2028e9d4b820743\setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(file);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, file, '"'"'exec'"'"'))' egg_info --egg-base 'C:\Users\user\AppData\Local\Temp\pip-pip-egg-info-cylfeyzm'
cwd: C:\Users\user\AppData\Local\Temp\pip-install-yxibz94l\evtx_04d644cd67554da6b2028e9d4b820743
Complete output (5 lines):
Traceback (most recent call last):
File "", line 1, in
File "C:\Users\user\AppData\Local\Temp\pip-install-yxibz94l\evtx_04d644cd67554da6b2028e9d4b820743\setup.py", line 34, in
RustExtension(
TypeError: __init__() got an unexpected keyword argument 'target'
----------------------------------------
WARNING: Discarding https://files.pythonhosted.org/packages/33/18/b32715bae61c4fe6a7cdb79aafccb0d4797a1bfef028e9689197af214966/evtx-0.6.8.tar.gz#sha256=414507b79fe997a35fbf05ae57dd2f55a7acfc669b19d9125a894ffe40dbeade (from https://pypi.org/simple/evtx/). Command errored out with exit status 1: python setup.py egg_info Check the logs for full command output.

Support Chunk Counts > u16

An event log can be so large that it has more chunks than the header's u16 chunk count can represent. We can calculate the chunk count from the size of the evtx stream, the header size, and the chunk size. This allows parsing chunks whose index is greater than u16::MAX. Will submit a PR.
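The calculation described above might look like this. The sizes are assumptions taken from common descriptions of the format (a 4096-byte file header block followed by 64 KiB chunks), and the function name is hypothetical:

```rust
// Assumed sizes: a 4096-byte file header block and 65536-byte chunks.
const FILE_HEADER_SIZE: u64 = 4096;
const CHUNK_SIZE: u64 = 65536;

/// Derives the number of complete chunks from the stream length instead of
/// trusting the u16 counter stored in the header.
fn chunk_count_from_len(stream_len: u64) -> u64 {
    stream_len.saturating_sub(FILE_HEADER_SIZE) / CHUNK_SIZE
}
```

A trailing partial chunk is rounded down, and a stream shorter than the header yields zero rather than underflowing.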

Output date range of evtx

Looking for a way to quickly output the date range of a particular evtx, i.e.:

Oldest log: 2/2/20
Newest log: 3/15/20

Something like what this cmdlet does:

Get-WinEvent -Path 'C:\workspace\Security.evtx' -MaxEvents 1 -oldest | Select-Object -Property TimeCreated

Any ideas?

No empty mappings for missing values

When using separate JSON attributes, elements that have no value should be left out. Currently, these are rendered as empty maps. For example, in one event you may have an entry that has no value for the RevocationResult element. This currently looks like:

"RevocationInfo": {
  "RevocationResult": {},
  "RevocationResult_attributes": {
    "value": "80092013"
  }
}

In another entry, the RevocationResult element has a text value:

"RevocationInfo": {
  "RevocationResult": "The revocation function was unable to check revocation because the revocation server was offline.",
  "RevocationResult_attributes": {
    "value": "80092013"
  }
}

While a value shouldn't be represented as something it's not, this also causes errors when doing actions like indexing because of type differences.

Will make a PR to fix.
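The requested behaviour can be sketched on a small stand-in tree type. `Node` and `prune_empty` are hypothetical and only model text values and objects; the eventual PR would operate on the library's own JSON values:

```rust
use std::collections::BTreeMap;

/// Minimal stand-in for a JSON value: either text or an object.
#[derive(Debug, Clone, PartialEq)]
enum Node {
    Text(String),
    Object(BTreeMap<String, Node>),
}

/// Drops every object that is empty, or becomes empty after its children
/// were pruned. Returns None when the node itself should be left out.
fn prune_empty(node: Node) -> Option<Node> {
    match node {
        Node::Text(text) => Some(Node::Text(text)),
        Node::Object(map) => {
            let kept: BTreeMap<String, Node> = map
                .into_iter()
                .filter_map(|(key, value)| prune_empty(value).map(|v| (key, v)))
                .collect();
            if kept.is_empty() {
                None
            } else {
                Some(Node::Object(kept))
            }
        }
    }
}
```

Applied to the first RevocationInfo example, this keeps `RevocationResult_attributes` and omits the empty `RevocationResult` entry, so the field is either absent or a string and never a map.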

Missing XML Data

Here is an example of missing data. (See Data tags).

H_Application.evtx.evtx_dump.xml

Record 3308
<?xml version="1.0" encoding="utf-8"?>
<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
  <System>
    <Provider Name="ESENT">
    </Provider>
    <EventID Qualifiers="0">916</EventID>
    <Level>4</Level>
    <Task>1</Task>
    <Keywords>0x80000000000000</Keywords>
    <TimeCreated SystemTime="2018-08-09 07:21:00.046087 UTC">
    </TimeCreated>
    <EventRecordID>3308</EventRecordID>
    <Channel>Application</Channel>
    <Computer>DESKTOP-1N4R894</Computer>
    <Security>
    </Security>
  </System>
  <EventData>
    <Data></Data>
    <Binary></Binary>
  </EventData>
</Event>
Record 3309
<?xml version="1.0" encoding="utf-8"?>
<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
  <System>
    <Provider Name="ESENT">
    </Provider>
    <EventID Qualifiers="0">916</EventID>
    <Level>4</Level>
    <Task>1</Task>
    <Keywords>0x80000000000000</Keywords>
    <TimeCreated SystemTime="2018-08-09 08:22:00.061763 UTC">
    </TimeCreated>
    <EventRecordID>3309</EventRecordID>
    <Channel>Application</Channel>
    <Computer>DESKTOP-1N4R894</Computer>
    <Security>
    </Security>
  </System>
  <EventData>
    <Data></Data>
    <Binary></Binary>
  </EventData>
</Event>

Compared to H_Application.evtx.evtxecmd.xml

<?xml version="1.0" encoding="utf-16"?>
<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
  <System>
    <Provider Name="ESENT" />
    <EventID Qualifiers="0">916</EventID>
    <Level>4</Level>
    <Task>1</Task>
    <Keywords>EventlogClassic</Keywords>
    <TimeCreated SystemTime="2018-08-09 07:21:00.0460872" />
    <EventRecordID>3308</EventRecordID>
    <Channel>Application</Channel>
    <Computer>DESKTOP-1N4R894</Computer>
    <Security />
  </System>
  <EventData>
    <Data>svchost, 2672,G,98, EseDiskFlushConsistency, ESENT, 0x800000</Data>
    <Binary></Binary>
  </EventData>
</Event>
<?xml version="1.0" encoding="utf-16"?>
<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
  <System>
    <Provider Name="ESENT" />
    <EventID Qualifiers="0">916</EventID>
    <Level>4</Level>
    <Task>1</Task>
    <Keywords>EventlogClassic</Keywords>
    <TimeCreated SystemTime="2018-08-09 08:22:00.0617638" />
    <EventRecordID>3309</EventRecordID>
    <Channel>Application</Channel>
    <Computer>DESKTOP-1N4R894</Computer>
    <Security />
  </System>
  <EventData>
    <Data>svchost, 2672,G,98, EseDiskFlushConsistency, ESENT, 0x800000</Data>
    <Binary></Binary>
  </EventData>
</Event>

You can find the RAW evtxs here:
https://www.dropbox.com/s/0vejq9lsjq1cskq/DEFCON_2018_DESKTOP_KAPE_EVTX_SET.zip?dl=0

You can find the output reports here:
https://www.dropbox.com/s/emx7lbkmq6xrwuc/DEFCON_2018_DESKTOP_EVTX_COMPARISON.zip?dl=0

In this example the file that contains the data depicted here can be found in the DEFCON_2018_DESKTOP_KAPE_EVTX_SET.zip set [\H\Windows\system32\winevt\logs\Application.evtx]

Feature Request: JSONL Output

Can I request jsonl output formatting? JSON is nice, but in its current form it does not ingest easily. Current JSON output example:

Record 1
{
  "Event": {
    "#attributes": {
      "xmlns": "http://schemas.microsoft.com/win/2004/08/events/event"
    },
    "EventData": {
      "Data": "caller=sppsvc.exe"
    },
    "System": {
      "Channel": "Microsoft-Client-Licensing-Platform/Admin",
      "Computer": "DESKTOP-1N4R894",
      "Correlation": null,
      "EventID": 100,
      "EventRecordID": 1,
      "Execution": {
        "#attributes": {
          "ProcessID": 1348,
          "ThreadID": 1368
        }
      },
      "Keywords": "0x2000000000000001",
      "Level": 4,
      "Opcode": 0,
      "Provider": {
        "#attributes": {
          "Guid": "B6CC0D55-9ECC-49A8-B929-2B9022426F2A",
          "Name": "Microsoft-Client-Licensing-Platform"
        }
      },
      "Security": {
        "#attributes": {
          "UserID": "S-1-5-18"
        }
      },
      "Task": 0,
      "TimeCreated": {
        "#attributes": {
          "SystemTime": "2018-07-06T18:38:20.815807Z"
        }
      },
      "Version": 0
    }
  }
}
Record 2
{
  "Event": {
    "#attributes": {
      "xmlns": "http://schemas.microsoft.com/win/2004/08/events/event"
    },
    "EventData": {
      "Data": "10.0.17134.1"
    },
    "System": {
      "Channel": "Microsoft-Client-Licensing-Platform/Admin",
      "Computer": "DESKTOP-1N4R894",
      "Correlation": null,
      "EventID": 101,
      "EventRecordID": 2,
      "Execution": {
        "#attributes": {
          "ProcessID": 1348,
          "ThreadID": 1372
        }
      },
      "Keywords": "0x2000000000000001",
      "Level": 4,
      "Opcode": 0,
      "Provider": {
        "#attributes": {
          "Guid": "B6CC0D55-9ECC-49A8-B929-2B9022426F2A",
          "Name": "Microsoft-Client-Licensing-Platform"
        }
      },
      "Security": {
        "#attributes": {
          "UserID": "S-1-5-18"
        }
      },
      "Task": 0,
      "TimeCreated": {
        "#attributes": {
          "SystemTime": "2018-07-06T18:38:20.819393Z"
        }
      },
      "Version": 0
    }
  }
}

Proposed JSONL output:

JSONL is easier to work with because each line is exactly one record. Example:

{"Event":{"#attributes":{"xmlns":"http://schemas.microsoft.com/win/2004/08/events/event"},"EventData":{"Data":"caller=sppsvc.exe"},"System":{"Channel":"Microsoft-Client-Licensing-Platform/Admin","Computer":"DESKTOP-1N4R894","Correlation":null,"EventID":100,"EventRecordID":1,"Execution":{"#attributes":{"ProcessID":1348,"ThreadID":1368}},"Keywords":"0x2000000000000001","Level":4,"Opcode":0,"Provider":{"#attributes":{"Guid":"B6CC0D55-9ECC-49A8-B929-2B9022426F2A","Name":"Microsoft-Client-Licensing-Platform"}},"Security":{"#attributes":{"UserID":"S-1-5-18"}},"Task":0,"TimeCreated":{"#attributes":{"SystemTime":"2018-07-06T18:38:20.815807Z"}},"Version":0}}}
{"Event":{"#attributes":{"xmlns":"http://schemas.microsoft.com/win/2004/08/events/event"},"EventData":{"Data":"10.0.17134.1"},"System":{"Channel":"Microsoft-Client-Licensing-Platform/Admin","Computer":"DESKTOP-1N4R894","Correlation":null,"EventID":101,"EventRecordID":2,"Execution":{"#attributes":{"ProcessID":1348,"ThreadID":1372}},"Keywords":"0x2000000000000001","Level":4,"Opcode":0,"Provider":{"#attributes":{"Guid":"B6CC0D55-9ECC-49A8-B929-2B9022426F2A","Name":"Microsoft-Client-Licensing-Platform"}},"Security":{"#attributes":{"UserID":"S-1-5-18"}},"Task":0,"TimeCreated":{"#attributes":{"SystemTime":"2018-07-06T18:38:20.819393Z"}},"Version":0}}}
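As a stopgap, the pretty-printed output can be collapsed into one line per record by removing whitespace outside string literals. This is a sketch that assumes its input is valid JSON; `compact_json` is a hypothetical helper, not a feature of evtx_dump:

```rust
/// Hypothetical helper: removes all whitespace that is not inside a JSON
/// string literal, turning a pretty-printed document into a single line.
fn compact_json(pretty: &str) -> String {
    let mut out = String::with_capacity(pretty.len());
    let mut in_string = false;
    let mut escaped = false;

    for ch in pretty.chars() {
        if in_string {
            out.push(ch);
            if escaped {
                escaped = false; // the character after a backslash is literal
            } else if ch == '\\' {
                escaped = true;
            } else if ch == '"' {
                in_string = false;
            }
        } else if ch == '"' {
            in_string = true;
            out.push(ch);
        } else if !ch.is_whitespace() {
            out.push(ch);
        }
    }
    out
}
```

Writing `compact_json(record)` followed by a newline for each record would give the JSONL layout shown above; the "Record N" separator lines would have to be skipped separately.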

InstanceID missing from logs

EVTX records have a property "InstanceID" which is related to EventID:

InstanceID is not EventID, but can be:

The InstanceId property uniquely identifies an event entry for a configured event source. The InstanceId for an event log entry represents the full 32-bit resource identifier for the event in the message resource file for the event source. The EventID property equals the InstanceId with the top two bits masked off. Two event log entries from the same source can have matching EventID values, but have different InstanceId values due to differences in the top two bits of the resource identifier. If the application wrote the event entry using one of the WriteEntry methods, the InstanceId property matches the optional eventId parameter. If the application wrote the event using WriteEvent, the InstanceId property matches the resource identifier specified in the InstanceId of the instance parameter. If the application wrote the event using the Win32 API ReportEvent, the InstanceId property matches the resource identifier specified in the dwEventID parameter.

Taken from here: https://evotec.xyz/powershell-everything-you-wanted-to-know-about-event-logs/
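The relationship stated in the quote ("the InstanceId with the top two bits masked off") amounts to a single mask. This sketch implements only that sentence; whether the full InstanceId can be recovered from the stored record is a separate question that this issue leaves open:

```rust
/// Derives the EventID from a 32-bit InstanceId by clearing the top two
/// bits, as described in the quoted documentation.
fn event_id_from_instance_id(instance_id: u32) -> u32 {
    instance_id & 0x3FFF_FFFF
}
```

The mapping is lossy in this direction: `0xC0000394` and `0x00000394` both map to EventID 916, which is why two entries can share an EventID while having different InstanceId values.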

I would very much like to have InstanceID read in. It isn't in the XML data; XML data contains EventID

I don't know enough about evtx structure to offer a patch.

Cross post with pyevtx-rs/issues/9

is_a_non_negative_number error

Getting an error on this file.

Here is the link: https://www.dropbox.com/s/1tugvc0gy0icv59/VSS1_Windows_system32_winevt_logs_HardwareEvents.evtx?dl=0

D:\Tools\evtx_dump>evtx_dump.exe D:\Images\CTF_DEFCON_2018\Image3-Desktop\Extracts\EVTX\VSS1_Windows_system32_winevt_logs_HardwareEvents.evtx
thread 'main' panicked at 'Failed to load evtx file located at D:\Images\CTF_DEFCON_2018\Image3-Desktop\Extracts\EVTX\VSS1_Windows_system32_winevt_logs_HardwareEvents.evtx', src\bin\evtx_dump.rs:201:29
stack backtrace:
   0: std::sys::windows::backtrace::set_frames
             at /rustc/fc50f328b0353b285421b8ff5d4100966387a997\/src\libstd\sys\windows\backtrace\mod.rs:94
   1: std::sys::windows::backtrace::unwind_backtrace
             at /rustc/fc50f328b0353b285421b8ff5d4100966387a997\/src\libstd\sys\windows\backtrace\mod.rs:81
   2: std::sys_common::backtrace::_print
             at /rustc/fc50f328b0353b285421b8ff5d4100966387a997\/src\libstd\sys_common\backtrace.rs:70
   3: std::sys_common::backtrace::print
             at /rustc/fc50f328b0353b285421b8ff5d4100966387a997\/src\libstd\sys_common\backtrace.rs:58
   4: std::panicking::default_hook::{{closure}}
             at /rustc/fc50f328b0353b285421b8ff5d4100966387a997\/src\libstd\panicking.rs:200
   5: std::panicking::default_hook
             at /rustc/fc50f328b0353b285421b8ff5d4100966387a997\/src\libstd\panicking.rs:215
   6: std::panicking::rust_panic_with_hook
             at /rustc/fc50f328b0353b285421b8ff5d4100966387a997\/src\libstd\panicking.rs:478
   7: std::panicking::continue_panic_fmt
             at /rustc/fc50f328b0353b285421b8ff5d4100966387a997\/src\libstd\panicking.rs:385
   8: std::panicking::begin_panic_fmt
             at /rustc/fc50f328b0353b285421b8ff5d4100966387a997\/src\libstd\panicking.rs:340
   9: evtx_dump::is_a_non_negative_number
  10: <evtx::xml_output::XmlOutput<W> as evtx::xml_output::BinXmlOutput<W>>::visit_open_start_element
  11: std::rt::lang_start_internal::{{closure}}
             at /rustc/fc50f328b0353b285421b8ff5d4100966387a997\/src\libstd\rt.rs:49
  12: std::panicking::try::do_call<closure,i32>
             at /rustc/fc50f328b0353b285421b8ff5d4100966387a997\/src\libstd\panicking.rs:297
  13: panic_unwind::__rust_maybe_catch_panic
             at /rustc/fc50f328b0353b285421b8ff5d4100966387a997\/src\libpanic_unwind\lib.rs:87
  14: std::panicking::try
             at /rustc/fc50f328b0353b285421b8ff5d4100966387a997\/src\libstd\panicking.rs:276
  15: std::panic::catch_unwind
             at /rustc/fc50f328b0353b285421b8ff5d4100966387a997\/src\libstd\panic.rs:388
  16: std::rt::lang_start_internal
             at /rustc/fc50f328b0353b285421b8ff5d4100966387a997\/src\libstd\rt.rs:48
  17: main
  18: invoke_main
             at d:\agent\_work\1\s\src\vctools\crt\vcstartup\src\startup\exe_common.inl:78
  19: __scrt_common_main_seh
             at d:\agent\_work\1\s\src\vctools\crt\vcstartup\src\startup\exe_common.inl:288
  20: BaseThreadInitThunk
  21: RtlUserThreadStart

Problems parsing evtx files originating from NetApp

I get the error:

Failed to dump the next record.
Caused by:
0: Failed to parse record number 341
1: An error occurred while trying to serialize binary xml to output.
2: Building a JSON document failed with message: This is a bug - expected current value to exist, and to be an object type.
Check that the value is not Value::null

Unfortunately I have no additional info to provide from the output, and it seems to fail on all records.

[Feature Request] Lowercase keys in JSON Output

Hi,

I recently came across your evtx parser and was really impressed by its speed. Thank you for your efforts.

In one of my use cases I would like to import the resulting JSON files into Elasticsearch via Logstash, using Logstash filters to make them "ECS" (Elastic Common Schema) compliant. One of the ECS rules is lowercase field names. Logstash has a nice JSON parser, but it is not the best place to lowercase every potential key in a JSON structure.

Therefore I would like to ask whether there is a chance of getting another CLI argument along the lines of "lowercase all JSON keys"?
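What such an option would do can be sketched as a recursive walk over the document tree. `Json` is a minimal stand-in type and `lowercase_keys` a hypothetical function; neither is part of the evtx crate:

```rust
use std::collections::BTreeMap;

/// Minimal stand-in for a JSON value.
#[derive(Debug, Clone, PartialEq)]
enum Json {
    Scalar(String),
    Array(Vec<Json>),
    Object(BTreeMap<String, Json>),
}

/// Lowercases every object key at every nesting level; values are untouched.
fn lowercase_keys(value: Json) -> Json {
    match value {
        Json::Scalar(s) => Json::Scalar(s),
        Json::Array(items) => Json::Array(items.into_iter().map(lowercase_keys).collect()),
        Json::Object(map) => Json::Object(
            map.into_iter()
                .map(|(key, v)| (key.to_lowercase(), lowercase_keys(v)))
                .collect(),
        ),
    }
}
```

One design question this raises: keys that differ only by case (for example "Data" and "data") collapse into one entry, and in this sketch the later one silently wins.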

Feature Request: RunAsService or residential option

Would you consider implementing a constant log monitoring option "-d --run-as-service"?

The idea is to monitor a single evtx log for changes and feed them to STDOUT or to an XML/JSON file, so the new changes can be streamed to another host for processing.

As it works now, evtx_dump exits when it finishes processing the evtx log file.

Awesome work by the way! Thank you!

5111875 is an unknown value for bool, coercing to `true`

These (or similar) messages are emitted when evtx reads a boolean value (type code 0x0d) with a length of 4 whose value is neither 0x00 nor 0x01. According to Microsoft's definition, a BoolType is "An 8-bit integer that MUST be 0x00 or 0x01 (mapping to true or false, respectively)" (https://docs.microsoft.com/en-us/openspecs/windows_protocols/ms-even6/8aa98312-f199-4e37-a51f-d3a2ccb50d60).

There seems to be a bug somewhere either in the creator of evtx files or in the parser.

Microsoft defines the following (https://docs.microsoft.com/en-us/openspecs/windows_protocols/ms-even6/c73573ae-1c90-43a2-a65f-ad7501155956):

TemplateInstanceData = ValueSpec *Value; Emit using TemplateInstanceDataRule
ValueSpec = NumValues *ValueSpecEntry
ValueSpecEntry = ValueByteLength ValueType %x00
ValueByteLength = WORD
ValueType = 
  NullType / StringType / AnsiStringType / Int8Type / UInt8Type / 
  Int16Type / UInt16Type / Int32Type / UInt32Type / Int64Type / 
  Int64Type / Real32Type / Real64Type / BoolType / BinaryType / 
  GuidType / SizeTType / FileTimeType / SysTimeType / SidType / 
  HexInt32Type / HexInt64Type / BinXmlType / StringArrayType / 
  AnsiStringArrayType / Int8ArrayType / UInt8ArrayType / 
  Int16ArrayType / UInt16ArrayType / Int32ArrayType / UInt32ArrayType/
  Int64ArrayType / UInt64ArrayType / Real32ArrayType / 
  Real64ArrayType / BoolArrayType / GuidArrayType / SizeTArrayType / 
  FileTimeArrayType / SysTimeArrayType / SidArrayType / 
  HexInt32ArrayType / HexInt64ArrayType
BoolType = %x0D

Value = 
  StringValue / AnsiStringValue / Int8Value / UInt8Value / 
  Int16Value / UInt16Value / Int32Value / UInt32Value / Int64Value /
  UInt64Value / Real32Value / Real64Value / BoolValue / BinaryValue / 
  GuidValue / SizeTValue / FileTimeValue / SysTimeValue / SidValue /
  HexInt32Value / HexInt64Value / BinXmlValue / StringArrayValue / 
  AnsiStringArrayValue / Int8ArrayValue / UInt8ArrayValue / 
  Int16ArrayValue / UInt16ArrayValue / Int32ArrayValue / 
  UInt32ArrayValue / Int64ArrayValue / UInt64ArrayValue / 
  Real32ArrayValue / Real64ArrayValue / BoolArrayValue / 
  GuidArrayValue / SizeTArrayValue / FileTimeArrayValue / 
  SysTimeArrayValue / SidArrayValue / HexInt32ArrayValue / 
  HexInt64ArrayValue

So, a boolean should look like the following:

0x00000001 0x01 0x0d 0x00 0x00
    |        |    |    |    |
    |        |    |    |    +-> Value
    |        |    |    +------> %x00
    |        |    +-----------> ValueType
    |        +----------------> ValueByteLength
    +-------------------------> NumValues

But obviously, there are (sometimes) BoolTypes with a ValueByteLength of 4, which violate the specification.
You've added a special handling for boolean values which do not match 0x00 or 0x01. Do you know why there are such values?

I'm not sure if this is really a bug in your code, but reading 4 bytes for a boolean value also violates the specification, and I would like to know the reason for it.
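The tolerant handling discussed here can be sketched as follows. `decode_bool` is a hypothetical function, not the crate's code: it accepts the 1-byte form from the specification as well as the 4-byte form seen in practice, coerces any non-zero value to `true`, and reports whether the raw value was within the specification:

```rust
/// Hypothetical decoder: returns (value, in_spec), where `in_spec` is false
/// when the raw integer is neither 0 nor 1. Unsupported lengths yield None.
fn decode_bool(bytes: &[u8]) -> Option<(bool, bool)> {
    let raw: u32 = match bytes {
        [b] => u32::from(*b),
        [a, b, c, d] => u32::from_le_bytes([*a, *b, *c, *d]),
        _ => return None,
    };
    Some((raw != 0, raw <= 1))
}
```

For the value in the title, 5111875 (0x004E0043), this returns `true` together with an out-of-spec marker, which corresponds to the warning being logged.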

advice for reading live events

Hi, I am new to Rust and wonder if you have any examples for reading Windows event logs on a live system. And of course thanks for making this fast library!

Compilation fails with "error[E0603]: module `export` is private"

It seems that there is some interference with serde-1.0.123

Environment

$ cargo --version
cargo 1.49.0 (d00d64df9 2020-12-05)

$ uname -v
Darwin Kernel Version 19.6.0: Tue Nov 10 00:10:30 PST 2020; root:xnu-6153.141.10~1/RELEASE_X86_64

Command

cargo install evtx

Error Message

   Compiling evtx v0.6.8
error[E0603]: module `export` is private
   --> /Users/jasa/.cargo/registry/src/github.com-1ecc6299db9ec823/evtx-0.6.8/src/binxml/name.rs:11:12
    |
11  | use serde::export::Formatter;
    |            ^^^^^^ private module
    |
note: the module `export` is defined here
   --> /Users/jasa/.cargo/registry/src/github.com-1ecc6299db9ec823/serde-1.0.123/src/lib.rs:275:5
    |
275 | use self::__private as export;
    |     ^^^^^^^^^^^^^^^^^^^^^^^^^

error[E0603]: module `export` is private
   --> /Users/jasa/.cargo/registry/src/github.com-1ecc6299db9ec823/evtx-0.6.8/src/model/deserialized.rs:5:12
    |
5   | use serde::export::Formatter;
    |            ^^^^^^ private module
    |
note: the module `export` is defined here
   --> /Users/jasa/.cargo/registry/src/github.com-1ecc6299db9ec823/serde-1.0.123/src/lib.rs:275:5
    |
275 | use self::__private as export;
    |     ^^^^^^^^^^^^^^^^^^^^^^^^^

error: aborting due to 2 previous errors

For more information about this error, try `rustc --explain E0603`.
error: failed to compile `evtx v0.6.8`, intermediate artifacts can be found at `/var/folders/0d/tjq7d2vn3nl19k00k4_gpzyr1q_6wx/T/cargo-installdk40ap`

Caused by:
  could not compile `evtx`

To learn more, run the command again with --verbose.
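
The compiler note shows that `export` is now only an alias for serde's private module. To my understanding, `serde::export::Formatter` was just a re-export of the standard library's formatter type, so importing it from `std::fmt` avoids the private path. A minimal sketch, where `ElementName` is a hypothetical stand-in type and not the crate's own:

```rust
// Import the formatter from the standard library instead of serde's private re-export.
use std::fmt::{self, Formatter};

/// Hypothetical stand-in for a type that needs a formatting impl.
struct ElementName(String);

impl fmt::Display for ElementName {
    fn fmt(&self, f: &mut Formatter<'_>) -> fmt::Result {
        write!(f, "{}", self.0)
    }
}
```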

macos 0.7.2

Just noticed there was no darwin version for 0.7.2. Figured I'd report it.

error on evtx files for header and hexdump

I am seeing the errors below for some evtx files and am not sure what causes them. Are there evtx logs that this Rust binary cannot handle?

Failed to dump the next record.

Caused by:
0: Failed to parse chunk number 0
1: Failed to parse chunk header
2: Failed to deserialize next_template_offset of type u32
3: Offset 0x08180000 (135790592) - An error has occurred while trying to deserialize binary stream

       Original message:
       `failed to fill whole buffer`
   
   Hexdump:
       
   
   ---------------------------------------------------------------------------
   Current Value 00
                 --
   
   00000000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
   00000010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
   00000020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
   00000030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
   00000040: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
   00000050: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
   00000060: 00 00 00 00                                      ....
   ----------------------------------------------------------------------------
   
4: failed to fill whole buffer

Failed to dump the next record.

Caused by:
0: Failed to parse chunk number 7
1: Failed to parse chunk header
2: Invalid EVTX chunk header magic, expected ElfChnk0, found [ 0, 0, 1B, 5, 0, 0, 2, E]
Failed to dump the next record.

Caused by:
0: Failed to parse chunk number 8
1: Failed to parse chunk header
2: Invalid EVTX chunk header magic, expected ElfChnk0, found [8A, 14, B3, D8, 1, F, 1, 1]
Failed to dump the next record.
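
To narrow down which chunks are affected, it can help to check the chunk magic directly. The sketch below assumes the commonly documented EVTX layout (a 4096-byte file header followed by 64 KiB chunks, each starting with `ElfChnk` and a NUL byte); `chunks_with_bad_magic` is a hypothetical helper and not part of the library.

```rust
// Layout constants are assumptions based on the commonly documented EVTX format.
const FILE_HEADER_SIZE: usize = 4096;
const CHUNK_SIZE: usize = 65536;
const CHUNK_MAGIC: &[u8; 8] = b"ElfChnk\0";

/// Return the indices of chunks whose first 8 bytes are not the expected magic.
fn chunks_with_bad_magic(file: &[u8]) -> Vec<usize> {
    // Skip the file header; a file shorter than the header has no chunks.
    let body = file.get(FILE_HEADER_SIZE..).unwrap_or(&[]);
    body.chunks(CHUNK_SIZE)
        .enumerate()
        .filter(|(_, chunk)| !chunk.starts_with(CHUNK_MAGIC))
        .map(|(index, _)| index)
        .collect()
}
```

A chunk that is entirely zero, like the hexdump above, would be listed here, as would one that was overwritten with unrelated data.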

RecordId should be public

tl;dr make type RecordId public so it can be used

I'd like to store the RecordId instance for later processing. While I know the RecordId is a u64, it would be nice if I could simply refer to RecordId.

As in this contrived example.

use ::evtx::EvtxParser;
use ::evtx::RecordId;
use std::path::PathBuf;

fn main() {
    let fp = PathBuf::new();
    let mut parser = EvtxParser::from_path(fp).unwrap();
    let mut ids: Vec<RecordId> = vec![];
    for record in parser.records() {
        match record {
            Ok(r) => {
                ids.push(r.event_record_id);
            },
            _ => {},
        }
    }
}

Currently, that code does not compile

error[E0432]: unresolved import `evtx::RecordId`
 --> src\main.rs:2:5
  |
2 | use ::evtx::RecordId;
  |     ^^^^^^^^^^^^^^^^ no `RecordId` in the root

Using evtx version 0.8.1.
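
Until the alias is exported, a local alias gives the same readability. The sketch below relies only on the statement above that the id is a `u64`; `Record` is a hypothetical stand-in for the crate's record type so that the snippet compiles without the crate.

```rust
/// Local alias mirroring the crate-private type (a u64 according to the report above).
type RecordId = u64;

/// Hypothetical stand-in for a parsed record; only the id field matters here.
struct Record {
    event_record_id: RecordId,
}

/// Collect the ids of all successfully parsed records, skipping errors.
fn collect_ids<E>(records: impl Iterator<Item = Result<Record, E>>) -> Vec<RecordId> {
    records
        .filter_map(Result::ok)
        .map(|record| record.event_record_id)
        .collect()
}
```

The downside of a local alias is that it would silently diverge if the crate ever changed the underlying type, which is one argument for exporting the real one.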

Custom Output Question

I would like to create a custom output, something very similar to evtx::json_output::JsonOutput, but with the ability to tweak how the JSON is generated to get around some hurdles when ingesting it into Elastic. Is it possible in 0.4.1 to create a custom output structure that implements BinXmlOutput? My issue is that some of the required modules are not public (for example evtx::model::xml::XmlElement).

Thanks in advance for the help.

pip3 install evtx leads to "ERROR: Command errored out with exit status 1:"

Getting this error about evtx:

Failed to build evtx
Installing collected packages: evtx
Running setup.py install for evtx ... error
ERROR: Command errored out with exit status 1:
command: /home/template/LogonTracer/bin/python -u -c 'import io, os, sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-k6d_zuyt/evtx_022aa9be838d483e91b20221f327d5e8/setup.py'"'"'; file='"'"'/tmp/pip-install-k6d_zuyt/evtx_022aa9be838d483e91b20221f327d5e8/setup.py'"'"';f = getattr(tokenize, '"'"'open'"'"', open)(file) if os.path.exists(file) else io.StringIO('"'"'from setuptools import setup; setup()'"'"');code = f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, file, '"'"'exec'"'"'))' install --record /tmp/pip-record-naa187u2/install-record.txt --single-version-externally-managed --compile --install-headers /home/template/LogonTracer/include/site/python3.9/evtx
cwd: /tmp/pip-install-k6d_zuyt/evtx_022aa9be838d483e91b20221f327d5e8/
Complete output (44 lines):
running install
running build
running build_ext
running build_rust
error: manifest path Cargo.toml does not exist
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/tmp/pip-install-k6d_zuyt/evtx_022aa9be838d483e91b20221f327d5e8/setup.py", line 21, in <module>
setup(
File "/home/template/LogonTracer/lib/python3.9/site-packages/setuptools/__init__.py", line 153, in setup
return distutils.core.setup(**attrs)
File "/usr/lib/python3.9/distutils/core.py", line 148, in setup
dist.run_commands()
File "/usr/lib/python3.9/distutils/dist.py", line 966, in run_commands
self.run_command(cmd)
File "/usr/lib/python3.9/distutils/dist.py", line 985, in run_command
cmd_obj.run()
File "/home/template/LogonTracer/lib/python3.9/site-packages/setuptools/command/install.py", line 61, in run
return orig.install.run(self)
File "/usr/lib/python3.9/distutils/command/install.py", line 590, in run
self.run_command('build')
File "/usr/lib/python3.9/distutils/cmd.py", line 313, in run_command
self.distribution.run_command(command)
File "/usr/lib/python3.9/distutils/dist.py", line 985, in run_command
cmd_obj.run()
File "/usr/lib/python3.9/distutils/command/build.py", line 135, in run
self.run_command(cmd_name)
File "/usr/lib/python3.9/distutils/cmd.py", line 313, in run_command
self.distribution.run_command(command)
File "/usr/lib/python3.9/distutils/dist.py", line 985, in run_command
cmd_obj.run()
File "/home/template/LogonTracer/lib/python3.9/site-packages/setuptools_rust/setuptools_ext.py", line 103, in run
build_rust.run()
File "/home/template/LogonTracer/lib/python3.9/site-packages/setuptools_rust/command.py", line 52, in run
self.run_for_extension(ext)
File "/home/template/LogonTracer/lib/python3.9/site-packages/setuptools_rust/build.py", line 92, in run_for_extension
dylib_paths = self.build_extension(ext)
File "/home/template/LogonTracer/lib/python3.9/site-packages/setuptools_rust/build.py", line 131, in build_extension
metadata = json.loads(check_output(metadata_command))
File "/usr/lib/python3.9/subprocess.py", line 424, in check_output
return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
File "/usr/lib/python3.9/subprocess.py", line 528, in run
raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['cargo', 'metadata', '--manifest-path', 'Cargo.toml', '--format-version', '1']' returned non-zero exit status 101.
----------------------------------------
ERROR: Command errored out with exit status 1: /home/template/LogonTracer/bin/python -u -c 'import io, os, sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-k6d_zuyt/evtx_022aa9be838d483e91b20221f327d5e8/setup.py'"'"'; file='"'"'/tmp/pip-install-k6d_zuyt/evtx_022aa9be838d483e91b20221f327d5e8/setup.py'"'"';f = getattr(tokenize, '"'"'open'"'"', open)(file) if os.path.exists(file) else io.StringIO('"'"'from setuptools import setup; setup()'"'"');code = f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, file, '"'"'exec'"'"'))' install --record /tmp/pip-record-naa187u2/install-record.txt --single-version-externally-managed --compile --install-headers /home/template/LogonTracer/include/site/python3.9/evtx Check the logs for full command output.

Using VM with
Distributor ID: Ubuntu
Description: Ubuntu 21.04
Release: 21.04
Codename: hirsute
❯ python3 -V
Python 3.9.5
❯ python -V
Python 3.9.5
❯ pip -V
pip 21.1.2 from /home/template/LogonTracer/lib/python3.9/site-packages/pip (python 3.9)
❯ pip3 -V
pip 21.1.2 from /home/template/LogonTracer/lib/python3.9/site-packages/pip (python 3.9)
❯ rustc -V
rustc 1.52.1 (9bc8c42bb 2021-05-09)

pip3 install evtx just fails, but pip3 install python-evtx works fine.

pip3 install python-evtx
Requirement already satisfied: python-evtx in ./lib/python3.9/site-packages (0.7.4)
Requirement already satisfied: pyparsing==2.4.7 in ./lib/python3.9/site-packages (from python-evtx) (2.4.7)
Requirement already satisfied: hexdump==3.3 in ./lib/python3.9/site-packages (from python-evtx) (3.3)
Requirement already satisfied: configparser==4.0.2 in ./lib/python3.9/site-packages (from python-evtx) (4.0.2)
Requirement already satisfied: more-itertools==5.0.0 in ./lib/python3.9/site-packages (from python-evtx) (5.0.0)
Requirement already satisfied: zipp==1.0.0 in ./lib/python3.9/site-packages (from python-evtx) (1.0.0)
Requirement already satisfied: six in ./lib/python3.9/site-packages (from python-evtx) (1.16.0)

The evtx binary itself is installed and runs:
evtx_dump -h
EVTX Parser 0.7.2
Omer B. [email protected]
Utility to parse EVTX files

USAGE:
evtx_dump [FLAGS] [OPTIONS]

FLAGS:
--no-confirm-overwrite When set, will not ask for confirmation before overwriting files, useful for

Only the pip install fails.

Is there anything I missed or did wrong?
Thanks

Proposal to support jsonl (json line)

Proposal to support jsonl https://jsonlines.org/ as output format.
jsonl files are JSON dicts/types separated by newlines.

{"event": "foo"}
{"event": "baa"}
...

This makes it extremely easy to use the output with every language that supports iterating through lines
and parsing JSON (and it would also be greppable).
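
As a rough sketch of why the format is convenient, the two hypothetical helpers below write already-serialized JSON documents one per line and read them back with plain line iteration. They use only the standard library and do not parse the JSON itself.

```rust
use std::io::{self, BufRead, Write};

/// Write each already-serialized JSON document on its own line.
fn write_jsonl<W: Write>(mut out: W, docs: &[String]) -> io::Result<()> {
    for doc in docs {
        // A record must not contain a raw newline; compact JSON satisfies this.
        debug_assert!(!doc.contains('\n'));
        writeln!(out, "{}", doc)?;
    }
    Ok(())
}

/// Read the records back, one per non-empty line.
fn read_jsonl<R: BufRead>(input: R) -> io::Result<Vec<String>> {
    let mut docs = Vec::new();
    for line in input.lines() {
        let line = line?;
        if !line.trim().is_empty() {
            docs.push(line);
        }
    }
    Ok(docs)
}
```

Each line returned by `read_jsonl` can then be handed to whatever JSON parser the consumer already uses.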

Command line flag to skip printing "#attributes" while taking output as JSON

The JSON output contains "#attributes" which alters the true nature of the log and makes querying data a challenge.

The introduction of a simple command line flag that skips printing the "#attributes" text and prints even attributes as simple parent-child will make life easy for anybody who has to load and query the output of this project.

JSON formed by parsing EVTX using rust_evtx:

{
  "Event": {
    "#attributes": {
      "xmlns": "http://schemas.microsoft.com/win/2004/08/events/event"
    }
    .
    .
  }
}

Desired JSON:

{
  "Event": {
    {
      "xmlns": "http://schemas.microsoft.com/win/2004/08/events/event"
    }
    .
    .
  }
}

Thank you for considering my sincere request.
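
One possible reading of this request is that the keys of every "#attributes" object are merged into the parent element. The sketch below illustrates that interpretation on a minimal, hypothetical `Json` type (not the library's data model); on a key collision it keeps the existing child element, which is an arbitrary choice.

```rust
use std::collections::BTreeMap;

/// Minimal JSON-like tree, just enough for this sketch.
#[derive(Debug, Clone, PartialEq)]
enum Json {
    Str(String),
    Obj(BTreeMap<String, Json>),
}

/// Recursively merge every "#attributes" object into its parent object.
/// On a key collision the existing child element wins.
fn flatten_attributes(value: Json) -> Json {
    match value {
        Json::Obj(map) => {
            let mut out = BTreeMap::new();
            let mut attrs = BTreeMap::new();
            for (key, child) in map {
                match child {
                    // Set the attribute object aside and merge it afterwards.
                    Json::Obj(a) if key == "#attributes" => attrs = a,
                    other => {
                        out.insert(key, flatten_attributes(other));
                    }
                }
            }
            for (key, val) in attrs {
                out.entry(key).or_insert(val);
            }
            Json::Obj(out)
        }
        other => other,
    }
}
```

The collision case is the main reason such a flag would need care: an attribute and a child element with the same name can no longer be told apart after merging.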

Feature Request: XML String to JSON

How hard would it be to implement JSON output from an XML string? This would help with testing, where you want to validate the output JSON structure against specific XML events. It could also be useful for tools that retrieve the XML string through the Windows API and want to convert it to JSON while keeping a one-to-one structure with this library's output.

Parser fails if last_event_record_id and free_space_offset are set wrong in the Chunk Header

While trying to import Sysmon Event Logs provided by SANS in the Workshop "Cobalt Strike Detection with Event Log Analysis" (see https://www.sans.org/webcasts/tech-tuesday-workshop-cobalt-strike-detection-log-analysis-119395/) to Kuiper, I faced the following parsing error:

Failed 1: Invalid EVTX record header magic, expected `2a2a0000`, found `[ 0, 0, 0, 0]` - Line No. 21

I was able to reproduce the issue with another Sysmon Event Log file and found out that if the chunk header fields last_event_record_id as well as free_space_offset are greater than the actual number of records in the chunk, the parser fails with the aforementioned error. A sample output of parsing the Sysmon Event Log file provided by SANS in debug mode is shown below. Please note that I added the output of the chunk header fields for debugging purposes.

14:38:58 [INFO] first_event_record_number - 188705
14:38:58 [INFO] last_event_record_number - 188775
14:38:58 [INFO] first_event_record_id - 188705
14:38:58 [INFO] last_event_record_id - 188775
14:38:58 [INFO] free_space_offset - 64568
14:38:58 [INFO] Initializing string cache
14:38:58 [INFO] Initializing template cache
14:38:58 [INFO] Record id - 188705
14:38:58 [DEBUG] (1) evtx::evtx_chunk: Record header - EvtxRecordHeader { data_size: 3000, event_record_id: 188705, timestamp: 2018-09-07T04:28:25.337132Z }
14:38:58 [DEBUG] (1) evtx::binxml::assemble: Template in offset 550 was not found in cache
14:38:58 [DEBUG] (1) evtx::binxml::assemble: Template in offset 2065 was not found in cache
14:38:58 [INFO] Record id - 188706
14:38:58 [DEBUG] (1) evtx::evtx_chunk: Record header - EvtxRecordHeader { data_size: 776, event_record_id: 188706, timestamp: 2018-09-07T04:29:02.596583Z }
14:38:58 [DEBUG] (1) evtx::binxml::assemble: Template in offset 550 was not found in cache
14:38:58 [DEBUG] (1) evtx::binxml::assemble: Template in offset 2065 was not found in cache
14:38:58 [INFO] Record id - 188707
14:38:58 [DEBUG] (1) evtx::evtx_chunk: Record header - EvtxRecordHeader { data_size: 744, event_record_id: 188707, timestamp: 2018-09-07T04:29:40.365998Z }
14:38:58 [DEBUG] (1) evtx::binxml::assemble: Template in offset 550 was not found in cache
14:38:58 [DEBUG] (1) evtx::binxml::assemble: Template in offset 2065 was not found in cache
14:38:58 [INFO] Record id - 188708
14:38:58 [DEBUG] (1) evtx::evtx_chunk: Record header - EvtxRecordHeader { data_size: 744, event_record_id: 188708, timestamp: 2018-09-07T04:30:58.380798Z }
14:38:58 [DEBUG] (1) evtx::binxml::assemble: Template in offset 550 was not found in cache
14:38:58 [DEBUG] (1) evtx::binxml::assemble: Template in offset 2065 was not found in cache
14:38:58 [INFO] Record id - 188709
14:38:58 [DEBUG] (1) evtx::evtx_chunk: Record header - EvtxRecordHeader { data_size: 744, event_record_id: 188709, timestamp: 2018-09-07T04:30:58.380798Z }
14:38:58 [DEBUG] (1) evtx::binxml::assemble: Template in offset 550 was not found in cache
14:38:58 [DEBUG] (1) evtx::binxml::assemble: Template in offset 2065 was not found in cache
14:38:58 [INFO] Record id - 188710
14:38:58 [DEBUG] (1) evtx::evtx_chunk: Record header - EvtxRecordHeader { data_size: 744, event_record_id: 188710, timestamp: 2018-09-07T04:30:58.380798Z }
14:38:58 [DEBUG] (1) evtx::binxml::assemble: Template in offset 550 was not found in cache
14:38:58 [DEBUG] (1) evtx::binxml::assemble: Template in offset 2065 was not found in cache
14:38:58 [INFO] Record id - 188711
14:38:58 [DEBUG] (1) evtx::evtx_chunk: Record header - EvtxRecordHeader { data_size: 744, event_record_id: 188711, timestamp: 2018-09-07T04:32:12.405200Z }
14:38:58 [DEBUG] (1) evtx::binxml::assemble: Template in offset 550 was not found in cache
14:38:58 [DEBUG] (1) evtx::binxml::assemble: Template in offset 2065 was not found in cache
Failed to dump the next record.

Caused by:
    0: An error occurred while trying to deserialize evtx stream.
    1: Invalid EVTX record header magic, expected `2a2a0000`, found `[ 0,  0,  0,  0]`

Within the source file evtx_chunk.rs this should be the code lines of interest.

evtx/src/evtx_chunk.rs

Lines 250 to 258 in 0950198

let record_header = match EvtxRecordHeader::from_reader(&mut cursor) {
    Ok(record_header) => record_header,
    Err(err) => {
        // We currently do not try to recover after an invalid record.
        self.exhausted = true;
        return Some(Err(EvtxError::DeserializationError(err)));
    }
};

From my point of view, the parser should not fail completely if chunk header fields are not set correctly. Instead, it should at least continue with the next chunk after an erroneous record could not be parsed.
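
A minimal sketch of the tolerant behaviour proposed here, assuming that each record starts with the magic `2a 2a 00 00` followed by a little-endian `u32` holding the total record size. `record_spans` is a hypothetical helper, not the library's code: it stops quietly at the first slot without a valid magic instead of returning an error, so a caller could simply move on to the next chunk.

```rust
const RECORD_MAGIC: [u8; 4] = [0x2a, 0x2a, 0x00, 0x00];

/// Walk the records in one chunk's record area and return (offset, size) pairs.
/// Stops at the first slot that does not look like a record instead of failing.
fn record_spans(area: &[u8]) -> Vec<(usize, usize)> {
    let mut spans = Vec::new();
    let mut offset = 0;
    while let Some(header) = area.get(offset..offset + 8) {
        if header[..4] != RECORD_MAGIC {
            break; // zeroed or stale space: treat as the end of the chunk
        }
        let size = u32::from_le_bytes([header[4], header[5], header[6], header[7]]) as usize;
        // A size smaller than the fields read here, or one that runs past
        // the buffer, is also treated as the end of the usable records.
        if size < 8 || offset + size > area.len() {
            break;
        }
        spans.push((offset, size));
        offset += size;
    }
    spans
}
```

With this approach, a chunk header that announces more records than are actually present costs only the missing records rather than the rest of the file.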

Nevertheless, thank you for your excellent work and for providing this Event Log parser!

Multi-threading not enabled by default when using the library

Hello,

I am trying to use the library mode of evtx because I need to script additional things (I want to send the output to a Splunk instance). Maybe I missed something, but multithreading does not seem to be enabled. See the following two pictures showing a test with a 20 MB Application evtx file, first with the binary and then with the library:

Binary mode:
[screenshot: evtx_binary]

Library mode:
[screenshot: evtx_library]

Any ideas? Maybe I need to add something to the Cargo configuration file.

Note that in library mode, the additional threads are created but not used.
Best regards,
ekt0
