ククログ

株式会社クリアコード > ククログ > Fluentd meetup 2019でWindows EventLogに関するプラグイン回りの発表した話

Fluentd meetup 2019でWindows EventLogに関するプラグイン回りの発表した話

先日OSS Summitの共催イベントのFluentd meetupでWindows EventLogに関するプラグイン回りの発表をしてきた畑ケです。

関連リンク:

経緯

Windows EventLogを取得したい要望がありました。 検討した結果、例えば以下の問題があることが分かりました。

  • fluent-plugin-windows-eventlogに含まれているin_windows_eventlogが時折SEGVする問題

  • WindowsのANSIコードページを超える文字セットを扱う時に文字化けしてしまう問題

これらの問題を解決するのにはWindows Vistaから導入された新しいAPIを元にしたライブラリを作成し、それを使用するプラグインの作成が必要となりました。 今回のスライドはこれらの問題の解決策を軽く導入として話し、実際に作成したプラグインがこれらの問題を解決しているかどうかを実例を交えつつ解説しました。

イベントの内容

共催イベントということで、会場設備はとても良かったです。 Fluentdはこれから先、より動的なサービスオペレーションにより親和性の高い改良がなされていくんだろうなぁと感じました。

また、筆者は登壇は何回かしたことがあるのですが、初めて英語で発表するということもあり若干早口になっていたかもしれません。

使用した英語スクリプト

数字やバージョン番号などは読み方を迷ってしまうと考えたため、一部数字やバージョン番号などを読み間違えないようにアルファベット表記にしてあったり、括弧付きの注記にしてあります。

My name is Hiroshi Hatake. 
Working at ClearCode Inc(operated) as a software developer.
Now, OK. Today, I'm gonna talk about "Fluentd meets Unicode Windows EventLog".

So, here is today’s my talk agenda.
First, I'll talk about motivation.
Then, I'll talk about winevt_c which is the newly created gem shortly, Unicode character handling, using ANSI code page issues, Unicode testing, benchmark, 
throughput benchmark, and conclusion.

Now, let's start to talk about motivation.

in_windows_eventlog which is the old plugin has some issues.
1st. Unicode character handling. Sometimes garbage characters are generated.
2nd. Memory consumption in flood of windows event
3rd. Sometimes it causes segmentation fault
4th. CPU spike when resuming operation
5th. At least one event should exist in the listening channel on start to listen. Otherwise, nothing to be read(red).

They are caused by dependent gem which is named win thirty two-eventlog.

Next, I'm talking about winevt_c.
This gem is a solution for Unicode handling issue, but this issue is very complicated. I’ll describe it later.

Then, I show you some winevt_c itself code examples.
First, this code is querying for specified channel.
In this case, specified channel is application.

Second, displayed example code is updating bookmark for querying channel.
Bookmark is useful to resume operation.

Third, the example code subscribes specified channel.
Subscribe API is battery included. 
This API is useful for most users.

The new gem which is named winevt_c solves win thirty-two-eventlog issues

1st. Improve Unicode character handling
2nd. It doesn’t cause segmentation fault on the same situation
3rd. CPU spike when resuming operation is declined
4th. Reduce Memory consumption in flood of windows event. This issue is partially solved but declined consuming memory.
5th. At least one event should exist in the listening channel on starting to listen. Empty channel can also subscribe. The older one will be staled.

1st Issue is the hardest problem and the others are solved by new API to read Windows EventLog.

The relationship of plugins and gems in this talk.
The old plugin which is named in_windows_eventlog depends on win32-eventlog.

And

The new plugin which is named in_windows_eventlog2 depends on winevt_c gem which is the newly created one.

Next, Unicode character handling.
This part and the next part treat the hardest issues in this talk.

Then, we should consider what Unicode means.

WHAT IS UNICODE?

In Windows context, it means UTF-sixteen.

Ruby also can handle UTF-sixteen in Ruby code world, although we'll consider Ruby C extension.
Ruby C extension API does not provide convenient API to handle UTF-sixteen. 
Thus, we should say: In Ruby C extension context, it means UTF-8.

What is the difference between ANSI and Unicode?

In Windows, ANSI means current code page.
In Japanese edition Windows, it's code page nine-hundred thirty-two.
Large A suffixed API uses ANSI character encoding.

In Windows, Unicode means UTF-sixteen.
Large W suffixed API uses UTF-sixteen character encoding.
PWSTR and such Large W contained type API arguments also use UTF-sixteen character encoding.

So, we need to convert from UTF-sixteen to UTF-8 as described before.

The win thirty-two eventlog uses ANSI version of Windows API.
To handle Unicode characters correctly in win thirty-two eventlog, we need to use Unicode version of them.
 
In addition, win thirty-two eventlog gem development is inactive in recent days.
And Unicode version patch exists, but it have not been merged in.

Then, next topic is Using ANSI code page issues.

ANSI code page is lightweight encoding way for specific characters, but it causes encoding issue in some cases. 

In Japanese Edition Windows, it uses code page nine-hundred thirty-two by default.
It can handle
* Alphabets
* Greek letters
* Cyrillic alphabets
* Hiragana and Katakana
* JIS level one and two Kanji Sets

Other characters cannot handle with code page nine-hundred thirty-two.

UTF-8 contains more characters!
UTF-8 encoding can handle code page nine-hundred thirty-two characters and,
Diacritical mark such as umlaut in German
Hebrew, Arabic, Devanagari. Devanagari is the letters for Hindi.
South east asia characters such as Thai, Laotian... etsetra.
And Emoji characters.

So, we reached ANSI code page issues solution.

We decide to develop the brand-new gem which is named winevt_c.
1. It uses new Windows API that is defined in <winevt.h>
2. The new API provides bookmark which is used to resume operation
3. Unicode native API

This gem's detail was described in about winevt_c section.

This gem is written in C and C plus plus.
So, users need to build C and C plus plus extension code by themselves.
Current RubyInstaller bundles MSYS2 system. We needn't  worry about it more than necessary.

The next topic is Unicode testing.

In this section, I'm talking about Unicode testing.
This test environment is:

* Japanese edition Windows ten home nineteen-ow-three six-four bits
* Prepared writing Windows EventLog tool which is written in C-sharp
* And I used PowerShell Core six on Windows Terminal. Not command prompt and raw Powershell terminal.
'Cause command prompt and raw PowerShell terminal cannot display Thai, Devanagari, Emoji and so on. 

Here is the snippet of writing Windows EventLog code that contains...
Alphabets
Non-ASCII symbols
Japanese
Thai
Cyrillic
Greek letters
Arabic alphabets
Devanagari
Unicod-ish Kaomoji
And Emoji

Unicode testing uses benchmarking tool which is created for heavily loaded benchmark, but it accepts parameters for lightweight writing.
The shown command-line means ten events written into benchmark channel.
And sleep ten milliseconds before it displays flow rate.

This slide and the next slide describe configuration for Unicode testing.
Fluentd will use in_windows_eventlog as a source and out_forward as an output.
Note that the old plugin requests to use from_encoding and encoding parameters to handle character encoding correctly.
And also default read_interval parameter which is two-seconds is used.
But still unhandled characters exist.

Fluentd will use in_windows_eventlog2 as a source and out_forward as an output.
Note that the new plugin doesn't request to use from_encoding and encoding parameters.
And also default read_interval parameter which is two-seconds is used.
The new plugin always handles character encoding as UTF-8.
So, they needn't anymore!

Then, let's see the old plugin execution log.
Symbol, Thai, Arabic, Devanagari, Unicode contains Kaomoji, and Emoji are displayed broken. 

The other result is the new plugin execution log.
It displays Symbol, Thai, Devanagari, Unicode contains Kaomoji, and Emoji are correctly displayed.
Arabic is slightly broken but it is Windows Terminal's displaying issue.

Now, we can handle Emoji characters contaminated Windows EventLog.

So, we reached benchmark section.

In this section, I'm talking about Benchmark.

Benchmark environment has two nodes:
The collector node is Windows ten eighteen-ow-nine and has two of virtual CPUs, four gigabytes memory and standard SSD.
The aggregator node is Ubuntu eighteen point zero-four and has two virtual CPUs, four gigabytes memory and standard SSD.

The collector Fluentd process and writing EventLog tool are running on the collector node.
And the aggregator Fluentd process is running on the aggregator node.

The collector Fluentd sends events with forward protocol and the aggregator Fluentd receives with the same protocol.

1 million events will write into collector node by benchmarking tool.
Its flow rate is about ninety-one per seconds.

The shown command-line generates events and writing them into the benchmark channel.

The left side configuration is for the collector node.
In_windows_eventlog is used as a source and out_forward is used as an output.
The right side configuration is for the aggregator node.
It receives events from the collector Fluentd with forward protocol. Output is out_null or out_stdout.

The left side configuration is for the collector node.
In_windows_eventlog2 is used as a source and out_forward is used as an output.
The right side configuration is for the aggregator node.
It also receives events from collector Fluentd with forward protocol. Output is out_null or out_stdout.

Now, let's see the benchmarking results.
First, here is the old plugin benchmarking result.
This graph shows that the old plugin uses around between one-hundred ten and two-hundred forty megabytes memory.
And CPU usage sometimes spikes but around ten percent is used in the entire node. 

Second, here is the new plugin benchmarking result.
This graph shows that the new plugin uses around one-hundred twenty megabytes memory.
But CPU usage slightly higher than the previous. 

Now, we can say pros and cons for the old plugin.

Pros: 
* Low CPU usage
Cons: 
* High memory usage
* And incomplete Unicode handling

Also, we can say pros and cons for the new plugin.

Pros: 
* Low memory usage
* Unicode handling
* immediately subscribe channel event if it's empty on subscribe

Cons: 
* Slightly higher CPU usage rather than the old plugin's

The last topic is throughput benchmarking.

In this section, I’m gonna talk about Throughput Benchmark.

Benchmark environment has two nodes:
The collector node and the aggregator node.
They are the same setup at Benchmarking test which is used in the previous section.

Throughput Benchmark setup is as follows:
Writing five-hundred-thousand events
Increase flow rate of events step by step
* around one-hundred-sixty events per second
* around two-hundred-ninety events per second
* around three-hundred-ten events per second
* around three-hundred-twenty events per second
* And the last case is stuck.
  (Chunk bytes limit exceeds for an emitted event stream warning is generated from Fluentd...)

The left side configuration is for collector node.
In_windows_eventlog2 is used as a source and out_forward is used as an output.
The right side configuration is for the aggregator node.
This configuration is same as the previous benchmarking test.

Here is around one-hundred-sixty events per second case. The new plugin can consume benchmark channel events.

The next is around two-hundred-ninety events per second case. The new plugin also can consume benchmark channel events.

Third is around three-hundred-ten events per second case. The new plugin also can consume benchmark channel events.
But it is slightly growing up memory consumption.

The last normal case is around three-hundred-twenty events per second case. The new plugin also can consume benchmark channel events.
But it is also slightly growing up memory consumption.

Finally, here is the conclusion.

The new plugin which is named in_windows_eventlog2 does...
* Improve Unicode handling. Thanks to winevt_c gem!
* Reduce memory consumption
* Solve CPU spike after resuming operation

But still there is an issue in the new plugin:
* Slightly higher CPU usage than the old one

From throughput benchmarking result, the new plugin can handle about three-hundred event per second with default read_interval parameter.

The new plugin which is named in_windows_eventlog2’s status is…

included fluent-plugin-windows-eventlog2 version zero-point-three-point-ow.
We wanna to hear more user voices and use cases
Note that installation is a bit of harder than the older one

Let's enjoy Monitoring Windows EventLog! 

Thank you for listening!
(Any questions?)

ふむ、こうして改めて眺めてみると中々長いですね。スライドの枚数が増えてしまったので20分の持ち時間でデモは難しそうということで当日はデモを見送りました。

まとめ

Fluentd meetupが久々に東京で開催されるということで登壇しました。 Windows EventLogのEventDataへはUTF-16でValidな文字であれば、どのような文字でも入れることができます。 現代では、UTF-16の文字セットの範囲内にアルファベットやキリル文字・ギリシャ文字だけでなく、東南アジア系の文字や、ヘブライ文字・アラビア文字といったものまで入っています。 また、扱いが手ごわそうな絵文字ですらも扱うことが出来たので、in_windows_eventlog2プラグインが動き始めた時には衝撃を受けました。

文字コードをうまく扱うのは大変ですし地味ですけれども、絵文字の登場によって文字コードの話題が実例を見せた時に見た目が賑やかになることで実際に扱えていることが視覚的に分かりやすくなったのは喜ばしいですね。