Comments (12)
@RJoshlan it means that, when you use binary encoding for your chromosomes, you rarely put your dataset values directly into the chromosome's genetic code. In the above example, the data from the dataset was never fed directly into the NSGA-II algorithm. Instead, a reference to the dataset was kept, and the only place where the dataset interacted with the algorithm was at the objective function level.
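As a rough illustration of that separation, here is a minimal sketch in plain Java. It does not use the library's own classes: `DatasetBackedObjective`, the `double[][]` dataset, and the `boolean[]` genetic code are hypothetical stand-ins, and the "mean of the selected values" score is only a placeholder for a real objective.

```java
// Hypothetical sketch: the objective holds the dataset, the genetic code never does.
final class DatasetBackedObjective {

    // Stand-in for a loaded dataset: rows are samples, columns are attributes.
    private final double[][] dataset;

    DatasetBackedObjective(double[][] dataset) {
        this.dataset = dataset;
    }

    // Scores a binary genetic code by averaging the values found in the
    // columns whose bit is set. The averaging is a placeholder objective.
    double evaluate(boolean[] geneticCode) {
        double sum = 0.0;
        int count = 0;
        for (double[] row : dataset) {
            for (int column = 0; column < geneticCode.length; column++) {
                if (geneticCode[column]) {
                    sum += row[column];
                    count++;
                }
            }
        }
        // An empty selection gets a score of 0.0 instead of dividing by zero.
        return count == 0 ? 0.0 : sum / count;
    }

    public static void main(String[] args) {
        double[][] data = {{1, 2, 3}, {4, 5, 6}};
        DatasetBackedObjective objective = new DatasetBackedObjective(data);
        System.out.println(objective.evaluate(new boolean[] {true, false, true}));
    }
}
```

The genetic code here is just a string of bits; only `evaluate` ever looks at the data.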
from nsga-ii.
I'm closing this issue, as I hope I was able to resolve it. Reopen it if you have more questions.
For the `jFree` error, refer to issue #8.
I shall provide detailed documentation on how to use external datasets with this library shortly. I am working on it.
thanks
Hello @RJoshlan, refer to the documentation here under the Getting Started section to understand how you can use your own custom datasets with the library. Let me know if you face any issues.
I am using this dataset (CICIDS2017) and I am not sure how to read it using `GeneticCodeProducer`. I tried, but the code doesn't even compile. Can you help me with this? I don't know where I am going wrong. I also tried permutation-based encoding, and it still doesn't work.
```java
public static GeneticCodeProducer geneticCodeProducerFromDataset(String path) {
    return (length) -> {
        List<BooleanAllele> geneticCode = new ArrayList<>();
        try {
            DataSource dataSource = new DataSource(path);
            // Loading the dataset
            Instances getData = dataSource.getDataSet();
            //length = getData.numAttributes();
            String geneFormat = "%0" + calculateGeneSize(path) + "d";
            length = getData.size();
            while (geneticCode.size() < length) {
                int data = ThreadLocalRandom.current().nextInt(1, getData.size());
                String gene = String.format(geneFormat, returnBinaryValueFromInt(getData.get(data).numAttributes()));
                for (char alleleChar : gene.toCharArray()) {
                    geneticCode.add(new BooleanAllele(returnBooleanValueFromChar(alleleChar)));
                }
            }
        } catch (Exception e1) {
            e1.printStackTrace();
        }
        return geneticCode;
    };
}
```
@RJoshlan I shall need more information about your work before I can help you. Firstly, provide me with the dataset you are working with so that I can take a look into it. Next, give me a very brief idea about how you want to encode your chromosomes with your dataset. Third, let me know what kind of encoding you want to use with your chromosomes.
I see you are trying to use `BooleanAllele`, from which I assume you have tried binary encoding. Do keep in mind that for binary encoding you usually do not encode your dataset directly into the chromosome; rather, you keep a reference to it.
@onclave Thanks for replying.
This is the dataset that I am using. It has 81 variables and 25,000 instances. (The original has around 200,000 instances and 81 variables, but I've uploaded a smaller one because of the file size.)
Dataset.zip
The chromosome encoding that I am trying to achieve depends directly on the dataset: for example, producing values that represent the dataset variable indices from 1 to 80, where a chromosome might have a length of 6 alleles chosen from the 80 variables.
Also, you were right that I used binary encoding, but I am getting the logic wrong when trying to keep the dataset as a reference for the chromosomes, which is why I contacted you for help.
Thanks.
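The index-based encoding described above (6 alleles drawn from attribute indices 1 to 80) could be sketched in plain Java as follows. This is only an illustration under assumptions: `IndexEncodingSketch` and `randomIndexChromosome` are hypothetical names, the sketch does not use the library's allele or chromosome classes, and it assumes the indices within one chromosome should be distinct.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical sketch of an index-based chromosome, independent of the library.
final class IndexEncodingSketch {

    // Picks geneCount distinct attribute indices in [1, attributeCount]
    // by shuffling all candidate indices and keeping the first geneCount.
    static List<Integer> randomIndexChromosome(int attributeCount, int geneCount, Random random) {
        if (geneCount > attributeCount) {
            throw new IllegalArgumentException("geneCount must not exceed attributeCount");
        }
        List<Integer> indices = new ArrayList<>();
        for (int i = 1; i <= attributeCount; i++) {
            indices.add(i);
        }
        Collections.shuffle(indices, random);
        return new ArrayList<>(indices.subList(0, geneCount));
    }

    public static void main(String[] args) {
        // Six attribute indices chosen from 1..80, as in the description above.
        System.out.println(randomIndexChromosome(80, 6, new Random()));
    }
}
```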
If I may make a guess, you basically have a 2D dataset with 81 columns (attributes) and 25k rows (samples). You would probably want to create a population out of this. Since I don't know what your work is and what you are trying to achieve, I shall take an example problem out of it and explain how to solve that using this library and then you can use that knowledge to see how that fits to your problem set.
Problem: Let's say that, given the samples, you want to perform feature selection among the 81 attributes to select 5 marker attributes.
Solution:
Each of your chromosomes shall be binary encoded with a length of 81. In the beginning, randomly generate a population of N chromosomes. The genetic code of each chromosome represents a candidate solution. An index whose allele value is 1 is considered a selected attribute, and an index whose allele value is 0 is considered not selected. This is how you keep a reference to your dataset with the chromosome.
Prepare your own objective functions against your dataset. They can be maximization problems or minimization problems. This library considers all objective functions to be maximization problems. Hence, for any minimization problem, take its inverse.
For each chromosome, based on its genetic code, prepare a subset of your dataset by selecting only those attributes that are "1". Again, this is how you keep a reference to your dataset with your NSGA-II code. NSGA-II will run the objective functions for you, and the objective functions will work with your dataset to provide objective values, or "fitness", for your chromosomes. NSGA-II will then use these values for each chromosome to perform non-dominated sorting, rank assignment and crowding-distance assignment. After G generations, NSGA-II will return the Pareto Front.
All this will be managed by NSGA-II, and you do not have to change any code within the library. All you have to do is write your own objective function and provide it to NSGA-II. You usually do not need to directly feed your dataset to the `GeneticCodeProducer`.
An objective function takes a chromosome. So, given a chromosome, you write your own logic for how it is used to prepare a subset of your original dataset with reference to its genetic code, and for what operations to perform on this subset in order to return a `double` value.
Once you have your Pareto Front, you use your own logic to select one chromosome as your final solution. This is not part of the NSGA-II package.
Once you have your selected solution, you use your own logic to select 5 best markers as your resultant biomarkers. This is not part of the NSGA-II package.
I hope this is explanatory enough to understand how to use your own dataset and work with this package.
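The steps above can be sketched in plain Java as shown below. This is a sketch under assumptions, not the library's API: `FeatureSelectionSketch` and its methods are hypothetical, the genetic code is a plain `boolean[]`, and the dataset is a plain `double[][]`. In particular, `asMaximization` shows only one possible reading of "take its inverse" (the reciprocal `1 / (1 + cost)`, assuming a non-negative cost); negating the cost would be another.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical sketch of binary-encoded feature selection, independent of the library.
final class FeatureSelectionSketch {

    // Builds a random binary genetic code with one bit per dataset attribute.
    static boolean[] randomGeneticCode(int attributeCount, Random random) {
        boolean[] code = new boolean[attributeCount];
        for (int i = 0; i < attributeCount; i++) {
            code[i] = random.nextBoolean();
        }
        return code;
    }

    // Decodes a genetic code into the zero-based indices of the selected attributes.
    static List<Integer> selectedAttributes(boolean[] code) {
        List<Integer> selected = new ArrayList<>();
        for (int i = 0; i < code.length; i++) {
            if (code[i]) {
                selected.add(i);
            }
        }
        return selected;
    }

    // Keeps only the selected columns of the dataset (rows are samples).
    static double[][] subset(double[][] dataset, List<Integer> selected) {
        double[][] result = new double[dataset.length][selected.size()];
        for (int row = 0; row < dataset.length; row++) {
            for (int j = 0; j < selected.size(); j++) {
                result[row][j] = dataset[row][selected.get(j)];
            }
        }
        return result;
    }

    // Assumed conversion of a non-negative minimization cost into a value to maximize.
    static double asMaximization(double cost) {
        return 1.0 / (1.0 + cost);
    }

    public static void main(String[] args) {
        boolean[] code = randomGeneticCode(81, new Random());
        System.out.println(selectedAttributes(code).size() + " of 81 attributes selected");
    }
}
```

An objective function would call `selectedAttributes` and `subset` on the chromosome's genetic code, compute its score on the resulting columns, and return that score as a `double`.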
Thanks, it is clearer now. I do have the objective functions, but I was getting it wrong when trying to encode using the dataset.
Just a point: you mentioned
"You usually do not need to directly feed your dataset to the GeneticCodeProducer".
What do you mean by this?
@onclave Thanks for your help.