alexholmes / json-mapreduce
InputFormat that can split multi-line JSON
License: Apache License 2.0
json-mapreduce does not support UTF-8. The problem is in src/main/java/com/alexholmes/json/parser/PartitionedJsonParser.java,
which does not use a UTF-8-aware InputStream.
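To illustrate the kind of problem described above, here is a minimal sketch contrasting byte-wise reading with UTF-8 decoding. It is not the code from PartitionedJsonParser.java; the class name Utf8ReadSketch and both helper methods are hypothetical, and casting each byte to a char is only an assumed example of how non-UTF-8-aware reading can go wrong.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class Utf8ReadSketch {

    // Decodes the whole stream as UTF-8 through an InputStreamReader.
    static String readAsUtf8(InputStream in) {
        StringBuilder sb = new StringBuilder();
        try (Reader reader = new InputStreamReader(in, StandardCharsets.UTF_8)) {
            char[] buf = new char[4096];
            int n;
            while ((n = reader.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    // Assumed failure mode: each byte is treated as one character,
    // so multi-byte UTF-8 sequences are split into several wrong chars.
    static String readBytewise(InputStream in) {
        StringBuilder sb = new StringBuilder();
        try {
            int b;
            while ((b = in.read()) != -1) {
                sb.append((char) b);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // \u00e9 is a non-ASCII letter that takes two bytes in UTF-8.
        String json = "{\"name\":\"Jos\u00e9\"}";
        byte[] bytes = json.getBytes(StandardCharsets.UTF_8);

        String decoded = readAsUtf8(new ByteArrayInputStream(bytes));
        String bytewise = readBytewise(new ByteArrayInputStream(bytes));

        System.out.println("UTF-8 decoding round-trips: " + decoded.equals(json));
        System.out.println("Byte-wise reading round-trips: " + bytewise.equals(json));
    }
}
```

The byte-wise variant yields one extra character for every two-byte sequence, so any length or offset computed from it would also drift from the decoded text.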
This issue is not related to your code; it is about creating a runnable JAR from Eclipse and then running your ExampleJob code with this command:
hadoop jar json-mapreduce-runnable.jar com.alexholmes.json.mapreduce.ExampleJob in out
I added my own example, which requires some external JARs, so I had to export the project as a runnable JAR instead of a plain JAR. When I create the runnable JAR from Eclipse, my program fails with an error. I then created a runnable JAR for ExampleJob (using its launch configuration) to test, and it fails with the following stack trace.
Note that neither the in nor the out directory exists.
Exception in thread "main" java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:483)
at org.eclipse.jdt.internal.jarinjarloader.JarRsrcLoader.main(JarRsrcLoader.java:58)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:483)
at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
Caused by: java.io.IOException: Input directory 'com.alexholmes.json.mapreduce.ExampleJob' exists - please remove and rerun this example
at com.alexholmes.json.mapreduce.ExampleJob.writeInput(ExampleJob.java:98)
at com.alexholmes.json.mapreduce.ExampleJob.run(ExampleJob.java:120)
However, when I create a normal JAR (not a runnable JAR), then
hadoop jar json-mapreduce.jar com.alexholmes.json.mapreduce.ExampleJob in out
behaves as expected. Let me know if you know anything about this.
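A possible explanation, consistent with the stack trace (the class name shows up as the input directory): when a JAR's manifest declares a Main-Class, hadoop jar uses that class and passes every argument after the JAR path to it, so the class name on the command line becomes the first program argument. The sketch below is a simplified model of that argument handling, not Hadoop's RunJar source; the class RunJarArgsSketch and the method programArgs are hypothetical names.

```java
import java.util.Arrays;

public class RunJarArgsSketch {

    // Returns the arguments the job's main method would receive.
    // manifestMainClass is null when the JAR manifest has no Main-Class entry.
    static String[] programArgs(String manifestMainClass, String[] argsAfterJar) {
        if (manifestMainClass != null) {
            // Main class comes from the manifest: nothing is consumed from the CLI.
            return argsAfterJar.clone();
        }
        if (argsAfterJar.length == 0) {
            throw new IllegalArgumentException("main class name expected after the JAR path");
        }
        // No manifest entry: the first CLI argument names the main class.
        return Arrays.copyOfRange(argsAfterJar, 1, argsAfterJar.length);
    }

    public static void main(String[] args) {
        String[] cli = {"com.alexholmes.json.mapreduce.ExampleJob", "in", "out"};

        // Plain JAR: the job sees "in" and "out".
        System.out.println(Arrays.toString(programArgs(null, cli)));

        // Eclipse runnable JAR: the manifest names JarRsrcLoader,
        // so the job sees the class name as its first argument.
        System.out.println(Arrays.toString(
            programArgs("org.eclipse.jdt.internal.jarinjarloader.JarRsrcLoader", cli)));
    }
}
```

Under this model, running the runnable JAR without the class name (hadoop jar json-mapreduce-runnable.jar in out) would pass in and out as the job's arguments; whether that resolves the problem in this setup is untested.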
@Override
protected boolean isSplitable(JobContext context, Path file) {
    // A file is splittable only when no compression codec matches it.
    CompressionCodec codec =
        new CompressionCodecFactory(context.getConfiguration()).getCodec(file);
    return codec == null;
}
Error:
Exception in thread "main" java.lang.IncompatibleClassChangeError: Found interface org.apache.hadoop.mapreduce.JobContext, but class was expected
at com.alexholmes.json.mapreduce.MultiLineJsonInputFormat.isSplitable(MultiLineJsonInputFormat.java:41)
at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:352)
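This error typically means the code was compiled against one Hadoop API generation and run against another: org.apache.hadoop.mapreduce.JobContext was a class in Hadoop 1.x and became an interface in Hadoop 2.x. The sketch below is one way to check which form is on the runtime classpath; the class ClassKindCheck and the method describe are hypothetical names, and rebuilding the library against the cluster's Hadoop version is the usual remedy rather than something confirmed by this report.

```java
public class ClassKindCheck {

    // Reports whether the named type is a class or an interface at runtime.
    static String describe(String className) {
        try {
            // Load without initializing, using this class's own class loader.
            Class<?> c = Class.forName(className, false, ClassKindCheck.class.getClassLoader());
            return c.isInterface() ? "interface" : "class";
        } catch (ClassNotFoundException e) {
            return "not on classpath";
        }
    }

    public static void main(String[] args) {
        // Run with the cluster's Hadoop JARs on the classpath to see
        // which form of JobContext the job will encounter.
        System.out.println(describe("org.apache.hadoop.mapreduce.JobContext"));
    }
}
```

If the result is "interface" while the library was built against a release where JobContext is a class, the mismatch matches the error above.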