The mgfsm from uma-pi1

Problem when executing mgfsm in distributed mode

Hi,
we are trying to execute mgfsm in distributed mode, but the translatedFS folder inside the output folder contains only two empty files: SUCCESS and part*.
No problems found when executing in sequential mode with the same input file.
What's wrong?

Output with supporting sequence IDs

I was browsing through the code but could not really pinpoint the place where I could potentially collect the supporting sequence IDs for each frequent pattern (and output them).

Is this possible to do in the algorithm (without too much effort)? If yes, where could I potentially start?
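
One possible starting point that leaves the miner untouched is a post-processing pass: once the frequent patterns are known, rescan the input and record which sequences contain each pattern. The sketch below is not mgfsm code. The class `SupportCollector`, its methods, and the in-memory `Map` of sequence ID to items are hypothetical names for illustration, and it assumes the gap parameter means "at most gamma items between two consecutive pattern items". It also ignores the maximum-length parameter and any output encoding mgfsm uses.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of mgfsm.
public class SupportCollector {

    /**
     * Returns true if `pattern` occurs in `seq` as a subsequence with at most
     * `gamma` items between consecutive matched positions (assumed gap semantics).
     */
    static boolean occurs(List<String> seq, List<String> pattern, int gamma) {
        if (pattern.isEmpty()) {
            return true;
        }
        int n = seq.size();
        // reach[j] == true: the pattern prefix processed so far can end at position j.
        boolean[] reach = new boolean[n];
        for (int j = 0; j < n; j++) {
            reach[j] = seq.get(j).equals(pattern.get(0));
        }
        for (int i = 1; i < pattern.size(); i++) {
            boolean[] next = new boolean[n];
            for (int j = 0; j < n; j++) {
                if (!seq.get(j).equals(pattern.get(i))) {
                    continue;
                }
                // The previous item must sit at most gamma + 1 positions to the left.
                for (int k = Math.max(0, j - gamma - 1); k < j; k++) {
                    if (reach[k]) {
                        next[j] = true;
                        break;
                    }
                }
            }
            reach = next;
        }
        for (boolean b : reach) {
            if (b) {
                return true;
            }
        }
        return false;
    }

    /** Collects the IDs of all sequences that contain the pattern. */
    static List<String> supportingIds(Map<String, List<String>> sequences,
                                      List<String> pattern, int gamma) {
        List<String> ids = new ArrayList<>();
        for (Map.Entry<String, List<String>> e : sequences.entrySet()) {
            if (occurs(e.getValue(), pattern, gamma)) {
                ids.add(e.getKey());
            }
        }
        return ids;
    }

    public static void main(String[] args) {
        Map<String, List<String>> db = new LinkedHashMap<>();
        db.put("s1", Arrays.asList("a", "b", "c"));
        db.put("s2", Arrays.asList("a", "x", "x", "c"));
        db.put("s3", Arrays.asList("c", "a"));
        // With gamma = 1, only s1 contains "a ... c" within the allowed gap.
        System.out.println(supportingIds(db, Arrays.asList("a", "c"), 1));
    }
}
```

This costs one extra scan of the input per pattern, so it is only practical for a modest number of patterns. Collecting the IDs inside the mining step itself would avoid the rescan but requires changes to the miner.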

No output file in sequential mode

Thanks for the very nice algorithm you have designed.
I was trying to run the code in sequential mode on a Windows computer.
I just tried to run the algorithm on the very simple example you have provided for testing first.

It seems that the algorithm runs, but the output file is not created: the output folder is created but remains empty.

I use this command in cmd to run the algorithm:

java -jar target/mgfsm-0.0.1-SNAPSHOT-jar-with-dependencies.jar -i C:/MGFSM/DATA/Example.txt/ -o SAMPLE_OUTPUT2 -s 2 -g 2 -l 2 -m s

Could you please let me know how I can fix this issue? I would really like to use your algorithm on my data; it seems very interesting.

Thanks,
Vahid

Below is what I receive when executing the above command in cmd:

20/03/19 15:35:09 ERROR util.Shell: Failed to locate the winutils binary in the hadoop binary path
java.io.IOException: Could not locate executable null\bin\winutils.exe in the Hadoop binaries.
at org.apache.hadoop.util.Shell.getQualifiedBinPath(Shell.java:355)
at org.apache.hadoop.util.Shell.getWinUtilsPath(Shell.java:370)
at org.apache.hadoop.util.Shell.<clinit>(Shell.java:363)
at org.apache.hadoop.util.GenericOptionsParser.preProcessForWindows(GenericOptionsParser.java:438)
at org.apache.hadoop.util.GenericOptionsParser.parseGeneralOptions(GenericOptionsParser.java:484)
at org.apache.hadoop.util.GenericOptionsParser.<init>(GenericOptionsParser.java:170)
at org.apache.hadoop.util.GenericOptionsParser.<init>(GenericOptionsParser.java:153)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:64)
at de.mpii.fsm.driver.FsmDriver.main(FsmDriver.java:558)
20/03/19 15:35:09 INFO common.AbstractJob: Command line arguments: {--endPhase=[2147483647], --execMode=[s], --gamma=[2], --indexing=[none], --input=[C:/MGFSM/DATA/], --lambda=[2], --numReducers=[90], --output=[SAMPLE_OUTPUT2/], --partitionSize=[10000], --startPhase=[0], --support=[2], --tempDir=[temp], --type=[a]}
20/03/19 15:35:09 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Deleting existing output path
The intermediate output will be written
to this temporary path :C:\Users\vahid\AppData\Local\Temp\MG_FSM_INTRM_OP_6557093669574892418
The temporary output associated with the internal map -reduce
jobs will be written to this temporary path :C:\Users\vahid\AppData\Local\Temp\MG_FSM_TEMP_OP_4937220842338026946
java.lang.NullPointerException
at java.lang.ProcessBuilder.start(Unknown Source)
at org.apache.hadoop.util.Shell.runCommand(Shell.java:482)
at org.apache.hadoop.util.Shell.run(Shell.java:455)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:715)
at org.apache.hadoop.util.Shell.execCommand(Shell.java:808)
at org.apache.hadoop.util.Shell.execCommand(Shell.java:791)
at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:656)
at org.apache.hadoop.fs.FilterFileSystem.setPermission(FilterFileSystem.java:490)
at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:462)
at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:428)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:908)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:889)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:786)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:775)
at de.mpii.fsm.driver.SequentialMode.encodeAndMine(SequentialMode.java:331)
at de.mpii.fsm.driver.SequentialMode.runSeqJob(SequentialMode.java:279)
at de.mpii.fsm.driver.FsmDriver.run(FsmDriver.java:512)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at de.mpii.fsm.driver.FsmDriver.main(FsmDriver.java:558)
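
The `null\bin\winutils.exe` in the first error suggests that no Hadoop home directory was configured on this machine. Hadoop 2.x on Windows looks for `winutils.exe` under the `bin` folder of the directory named by the `hadoop.home.dir` system property, falling back to the `HADOOP_HOME` environment variable. The sketch below only checks whether that file can be found. The class `WinutilsCheck` and its methods are hypothetical and not part of mgfsm or Hadoop, and it is an assumption that fixing this lookup also resolves the later `NullPointerException` and the empty output folder.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical diagnostic, not part of mgfsm or Hadoop.
public class WinutilsCheck {

    /** Picks the Hadoop home: system property first, then environment variable. */
    static String resolveHadoopHome(String sysProp, String envVar) {
        if (sysProp != null && !sysProp.isEmpty()) {
            return sysProp;
        }
        if (envVar != null && !envVar.isEmpty()) {
            return envVar;
        }
        return null;
    }

    /** Builds the expected location of winutils.exe under a Hadoop home. */
    static Path winutilsPath(String hadoopHome) {
        return Paths.get(hadoopHome, "bin", "winutils.exe");
    }

    public static void main(String[] args) {
        String home = resolveHadoopHome(
                System.getProperty("hadoop.home.dir"),
                System.getenv("HADOOP_HOME"));
        if (home == null) {
            System.out.println("Neither hadoop.home.dir nor HADOOP_HOME is set.");
            return;
        }
        Path exe = winutilsPath(home);
        if (Files.isRegularFile(exe)) {
            System.out.println("Found " + exe);
        } else {
            System.out.println("Missing " + exe);
        }
    }
}
```

If the check reports a missing file, one option to try is placing a `winutils.exe` that matches the bundled Hadoop version in a folder such as `C:\hadoop\bin` (an example path) and setting `HADOOP_HOME` to `C:\hadoop` before rerunning the command.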

Does this tool support non-English languages?

Will this tool work with non-English text inputs? Do I need to modify the tokenization or make any other adjustments?
