Comments (1)
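The index locking was wrong. In our case multiple threads can access the same index at the same time, so this needs to be handled carefully.
The corrected code is as follows: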
public static Directory readOnlyOpen(File path, LockFactory lockFactory) throws IOException
{
    // maplock and localLock (a map from index path to lock object) are fields not shown here.
    // Look up, or create, the lock object for this index path.
    Object lock;
    synchronized (maplock) {
        String key = path.getAbsolutePath();
        lock = localLock.get(key);
        if (lock == null)
        {
            lock = new Object();
            localLock.put(key, lock);
        }
    }
    // Only one thread at a time may rebuild the working copy of a given index.
    synchronized (lock) {
        System.out.println("LinkFSDirectory readOnlyOpen " + path.getAbsolutePath());
        File links = new File(path, "indexLinks");
        File hlinks = new File(path, "hdfsIndexLinks");
        if (!links.exists() && hlinks.exists())
        {
            return hdfsReadOnlyOpen(path, lockFactory);
        }
        // Collect the directories listed in indexLinks, one path per non-blank line.
        List<Directory> dirlist = new ArrayList<Directory>();
        System.out.println("links file is " + links.getAbsolutePath());
        if (links.exists())
        {
            BufferedReader br = new BufferedReader(new FileReader(links));
            try {
                String s1 = null;
                while ((s1 = br.readLine()) != null) {
                    if (s1.trim().length() > 0)
                    {
                        dirlist.add(LinkFSDirectory.open(new File(s1)));
                        System.out.println("LinkFSDirectory readOnlyOpen add links " + s1);
                    }
                }
            } finally {
                // Closing the BufferedReader also closes the underlying FileReader.
                br.close();
            }
        }
        // Recreate an empty working directory and merge the linked indexes into it.
        File workerspace = new File(path, getWorkDir(path.getAbsolutePath()));
        deleteDirectory(workerspace);
        workerspace.mkdirs();
        FSDirectory dir = open(workerspace, lockFactory);
        IndexWriter writer = new IndexWriter(dir, null, new KeepOnlyLastCommitDeletionPolicy(), MaxFieldLength.UNLIMITED);
        writer.setMergeFactor(512);
        writer.setUseCompoundFile(false);
        if (dirlist.size() > 0)
        {
            Directory[] dirs = new Directory[dirlist.size()];
            writer.addIndexesNoOptimize(dirlist.toArray(dirs));
        }
        writer.close();
        return dir;
    }
}
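// Per-path counter used to pick the working directory name.
private static HashMap<String, Integer> indexmap = new HashMap<String, Integer>();

// Returns the next working directory name for the given path.
// The counter wraps back to 0 once it passes 100, so directory names are reused.
public static synchronized String getWorkDir(String path)
{
    Integer index = indexmap.get(path);
    if (index == null)
    {
        index = 0;
    }
    index++;
    if (index > 100)
    {
        index = 0;
    }
    indexmap.put(path, index);
    return "workerspace_" + index;
}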
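
The lock lookup at the top of readOnlyOpen (one lock object per index path, created on first use) could probably be written more compactly. The following is only a sketch, assuming Java 8 or later is available, which may not hold for this project. The class name PathLocks and the method lockFor are hypothetical and are not part of higo.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical helper, not part of higo: hands out one lock object per key.
public class PathLocks {
    private static final ConcurrentMap<String, Object> LOCKS =
            new ConcurrentHashMap<String, Object>();

    // Returns the lock for the given key, creating it atomically on first use.
    // Every caller passing the same key receives the same object.
    public static Object lockFor(String key) {
        return LOCKS.computeIfAbsent(key, k -> new Object());
    }
}
```

With such a helper, the body of readOnlyOpen would be wrapped in `synchronized (PathLocks.lockFor(path.getAbsolutePath())) { ... }`, and the separate maplock would no longer be needed. As in the code above, lock objects are never removed from the map, so this only suits a bounded set of index paths.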
from higo.