
Comments (2)

thompsonb commented on August 30, 2024

I converted from hunalign ladder-style to the bleualign format. I believe this is the code I used:

def reformat(ladder_file, src_len, tgt_len):
    # ladder_file: tab-separated hunalign ladder output (source index, target index, ...)
    # src_len / tgt_len: number of source / target sentences
    alignments = []
    current_alignment = ([], [])
    prev_a1, prev_a2 = None, None

    # use a context manager so the file is closed after reading
    with open(ladder_file, 'r', encoding='utf-8') as ladder:
        for line in ladder:
            fields = line.strip().split('\t')
            a1, a2 = int(fields[0]), int(fields[1])

            # start a new alignment only when both indices have changed
            if a1 != prev_a1 and a2 != prev_a2 and current_alignment != ([], []):
                alignments.append(current_alignment)
                current_alignment = ([], [])

            current_alignment[0].append(a1)
            current_alignment[1].append(a2)
            prev_a1, prev_a2 = a1, a2

    # de-duplicate and sort the indices within each alignment
    alignments2 = []
    xx, yy = [], []
    for a1, a2 in alignments:
        x1 = sorted(set(a1))
        x2 = sorted(set(a2))
        alignments2.append((x1, x2))  # tuple of lists
        xx.extend(x1)
        yy.extend(x2)

    # add deletions/insertions (*not* in order)
    xx, yy = set(xx), set(yy)
    for x in range(src_len):
        if x not in xx:
            alignments2.append(([x], []))
    for y in range(tgt_len):
        if y not in yy:
            alignments2.append(([], [y]))

    return alignments2
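
Since the function above appends deletions and insertions at the end rather than in document order, here is a minimal sketch of one way to reorder its output and print it. The helper names `sort_alignments` and `format_alignments` are hypothetical and not part of vecalign; the sketch assumes the alignments are monotonic and that each index list is already sorted, and the `[src]:[tgt]` line format is only an assumption about what the downstream tool expects.

```python
def sort_alignments(alignments):
    """Return alignments in document order (hypothetical helper).

    Assumes monotonic alignments given as (src_indices, tgt_indices)
    tuples with each index list already sorted.
    """
    # Alignments that touch the source side, ordered by first source index.
    with_src = sorted((a for a in alignments if a[0]), key=lambda a: a[0][0])
    # Pure insertions (empty source side), ordered by first target index.
    insertions = sorted((a for a in alignments if not a[0]), key=lambda a: a[1][0])

    ordered = []
    k = 0
    for src, tgt in with_src:
        # Emit any insertions that precede this alignment on the target side.
        while tgt and k < len(insertions) and insertions[k][1][0] < tgt[0]:
            ordered.append(insertions[k])
            k += 1
        ordered.append((src, tgt))
    ordered.extend(insertions[k:])  # insertions after the last aligned target
    return ordered


def format_alignments(alignments):
    """Render one '[src]:[tgt]' line per alignment (assumed output format)."""
    return ['%s:%s' % (src, tgt) for src, tgt in alignments]


# Example shaped like the return value of reformat(): aligned pairs first,
# then an unordered deletion and insertion.
example = [([0], [0]), ([2, 3], [2]), ([1], []), ([], [1])]
for line in format_alignments(sort_alignments(example)):
    print(line)
```

When a deletion and an insertion fall between the same pair of aligned segments, their relative order is arbitrary; this sketch emits the deletion first.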

from vecalign.

tamohannes commented on August 30, 2024

@thompsonb can you please upload the corpora, along with their corresponding alignment files, to the repo?

