creeperyang / id3-parser

A pure JavaScript id3 tag parser.

TypeScript 100.00%

id3-parser

A pure JavaScript id3 tag parser.

Installation & Usage

import parse from 'id3-parser';

const tag = parse(buffer|uint8Array|number[]);
console.log(tag);

To use the library in a browser, you may need to bundle it with webpack or a similar tool. The library itself is written in TypeScript and transpiled to ES2015.

import parse from 'id3-parser';
import { convertFileToBuffer, fetchFileAsBuffer } from 'id3-parser/lib/util';

// You have a File instance in browser
convertFileToBuffer(file).then(parse).then(tag => {
    console.log(tag);
});
// Or a remote mp3 file url
fetchFileAsBuffer(url).then(parse).then(tag => {
    console.log(tag);
});

API

In most cases, you pass in an array of numbers (binary data) and get the id3 tag info back.

import parse, { parseV1Tag, parseV2Tag } from 'id3-parser';
parse(array) // ==> tag

parse(buffer|uint8Array|number[])

Parse id3v1 and id3v2.3 tags from a buffer (Node Buffer instance), a uint8Array (Uint8Array instance), or a plain array of numbers.

The typical parsed tag (return value) would be like:

{
    version: {
        v1: false, // means no id3v1 tag
        v2: {
            major: 2,
            minor: 3,
            revision: 0,
            flags: {
                unsync: false,
                xheader: false,
                experimental: false,
            },
        },
    },
    artist: '王菲',
    album: '匆匆那年',
    title: '匆匆那年',
    year: '2014-11-05',
    comments: [{
        description: '',
        language: 'eng',
        value: 'V1.0',
    }],
    lyrics: [{
        description: 'h',
        language: 'eng',
        // tslint:disable-next-line:max-line-length
        value: '\n[00:01.92]匆匆那年（电影《匆匆那年》主题曲）\n[00:02.19]作词：林夕\n[00:02.63]作曲：梁翘柏\n[00:02.98]演唱：王菲\n[00:04.44]\n[00:28.71]匆匆那年我们  究竟说了几遍  再见之后再拖延\n[00:34.21]可惜谁有没有  爱过不是一场  七情上面的雄辩\n[00:39.90]匆匆那年我们  一时匆忙撂下  难以承受的诺言\n[00:45.45]只有等别人兑现\n[00:49.53]\n[00:51.17]不怪那吻痕还  没积累成茧\n[00:56.71]拥抱着冬眠也没能  羽化再成仙\n[01:02.33]不怪这一段情  没空反复再排练\n[01:07.88]是岁月宽容恩赐  反悔的时间\n[01:16.20]\n[01:18.92]如果再见不能红着眼  是否还能红着脸\n[01:24.71]就像那年匆促  刻下永远一起  那样美丽的谣言\n[01:29.97]如果过去还值得眷恋  别太快冰释前嫌\n[01:36.03]谁甘心就这样  彼此无挂也无牵\n[01:41.67]我们要互相亏欠  要不然凭何怀缅\n[01:57.64]\n[01:58.81]匆匆那年我们  见过太少世面  只爱看同一张脸\n[02:04.19]那么莫名其妙  那么讨人欢喜  闹起来又太讨厌\n[02:09.85]相爱那年活该  匆匆因为我们  不懂顽固的诺言\n[02:15.34]只是分手的前言\n[02:19.61]\n[02:20.99]不怪那天太冷  泪滴水成冰\n[02:26.63]春风也一样没  吹进凝固的照片\n[02:32.26]不怪每一个人  没能完整爱一遍\n[02:37.86]是岁月善意落下  残缺的悬念\n[02:47.25]\n[02:48.88]如果再见不能红着眼  是否还能红着脸\n[02:54.71]就像那年匆促  刻下永远一起  那样美丽的谣言\n[02:59.99]如果过去还值得眷恋  别太快冰释前嫌\n[03:06.05]谁甘心就这样  彼此无挂也无牵\n[03:11.02]\n[03:11.27]如果再见不能红着眼  是否还能红着脸\n[03:17.53]就像那年匆促  刻下永远一起  那样美丽的谣言\n[03:22.31]如果过去还值得眷恋  别太快冰释前嫌\n[03:28.52]谁甘心就这样  彼此无挂也无牵\n[03:34.15]我们要互相亏欠\n[03:39.73]我们要藕断丝连\n[03:51.48]\n',
    }],
    image: {
        type: 'cover-front',
        mime: 'image/jpeg',
        description: 'e',
        data: buffer // just the cover's binary data
    },
}

parseV1Tag(buffer|uint8Array|number[]) | parseV2Tag(buffer|uint8Array|number[])

Parse only the id3v1 tag or only the id3v2 tag.
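To decide which of the two functions is worth calling, it can help to check which tags a file appears to contain. The following is a minimal sketch, not the library's own detection logic: `detectTags` and `matchesAscii` are hypothetical helper names. It relies only on the tag layouts themselves: an id3v1 tag occupies the last 128 bytes of a file and starts with "TAG", while an id3v2 tag normally sits at the start of the file and begins with "ID3".

```typescript
interface TagPresence {
    v1: boolean;
    v2: boolean;
}

// Compare the bytes at `offset` against an ASCII marker such as "ID3".
function matchesAscii(bytes: ArrayLike<number>, offset: number, text: string): boolean {
    if (offset < 0 || offset + text.length > bytes.length) {
        return false;
    }
    for (let i = 0; i < text.length; i++) {
        if (bytes[offset + i] !== text.charCodeAt(i)) {
            return false;
        }
    }
    return true;
}

// Hypothetical helper: report which tag markers are present.
function detectTags(bytes: ArrayLike<number>): TagPresence {
    return {
        // id3v1: fixed 128-byte block at the very end of the file
        v1: matchesAscii(bytes, bytes.length - 128, 'TAG'),
        // id3v2: header at the very start of the file
        v2: matchesAscii(bytes, 0, 'ID3'),
    };
}
```

A caller could then invoke parseV1Tag or parseV2Tag only when the corresponding marker is present.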

License

id3-parser's Issues

id3-parser v2

Rewrite with typescript.
Keep to spec http://id3.org/id3v2.3.0, handle more edge cases.
Adjust structure and API, easier to use.

Memory gobbled when using a buffer as input

For some reason, when file buffers are passed directly into the id3 parser, the memory never gets released, at least not until the script is killed. It's a known issue with buffers, because they're actually pointers to C++ memory locations containing the actual data. So setting a buffer to null or attempting to overwrite it only results in you losing access to the original buffer; it doesn't release the memory.

So, the solution, which I'll be adding to the readme: don't pass buffers directly into the id3 parser.

id3.parse(new Uint8Array(buffer)).then(function(tags) {

});

It still takes a while for garbage collection to clean up the buffers, but they do get cleaned up. I just went from using ~400 MB of RAM for my test file set down to 91 MB (on the server I've gone from 600-700 MB of memory used to 140 MB), so I'm fairly happy now. But if the parser is going to support raw buffers as input, we need to make sure it is dereferencing them properly.

I presume the issue could be solved if the buffer input were immediately converted into a Uint8Array, just as I'm doing outside the script, but I'm not certain of that at this point. I'll look into it once my semester is done in a couple of weeks.
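The conversion described in this issue can be wrapped in a small helper. This is a minimal sketch, assuming a hypothetical helper name `copyToUint8Array`. It only demonstrates that the result owns its own memory and is independent of the input; it does not verify the memory figures reported above.

```typescript
// Hypothetical helper: copy array-like binary input into a fresh Uint8Array.
// Constructing a Uint8Array from an array-like value (including another
// typed array) allocates new storage and copies the elements.
function copyToUint8Array(input: ArrayLike<number>): Uint8Array {
    return new Uint8Array(input);
}

const source = new Uint8Array([73, 68, 51]);
const copy = copyToUint8Array(source);

// Mutating the source afterwards does not affect the copy.
source[0] = 0;
console.log(copy[0]);
```

With a copy in hand, the caller can drop its reference to the original input before parsing.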

calcFrameSize returning NaN

Inside of parseV2Frames, I noticed that calcFrameSize would sometimes return NaN or a really large number and cause issues inside of the loop. I fixed the issue by changing to the following:

if (isNaN(size) || size === 0 || size > buffer.length - position - 10) {
    break;
}

Let me know what you think.
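The guard proposed in this issue can also be expressed as a standalone predicate, which makes the boundary cases easier to test. This is a sketch under assumptions: `isUsableFrameSize` is a hypothetical name, and the constant 10 is taken from the condition in the issue (it matches the 10-byte frame header in id3v2.3). Unlike the condition in the issue, this version also rejects negative sizes.

```typescript
const FRAME_HEADER_SIZE = 10;

// Hypothetical helper: true when `size` is a sane frame size for a frame
// whose header starts at `position` in a buffer of `bufferLength` bytes.
function isUsableFrameSize(size: number, bufferLength: number, position: number): boolean {
    if (isNaN(size) || size <= 0) {
        return false;
    }
    // The frame body must fit in what remains after the frame header.
    return size <= bufferLength - position - FRAME_HEADER_SIZE;
}
```

A parsing loop would then stop as soon as the predicate returns false, instead of advancing by a NaN or oversized value.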

Thanks for this library

I just want to express my thanks for this lib. Saves me a ton of time! Thanks!

A possible bug about APIC frame

Parsed APIC frame is like:

image: 
   { type: 'cover-front',
     mime: '浩条⽥灪来',
     description: 'e',
     data: <Buffer ff d8 ff e0 00 10 4a 46 49 46 00 01 01 00 00 01 00 01 00 00 ff fe 00 3c 43 52 45 41 54 4f 52 3a 20 67 64 2d 6a 70 65 67 20 76 31 2e 30 20 28 75 73 69 ... > } }

The mime value does not look correct.
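One possible explanation, offered as a hypothesis rather than a confirmed diagnosis: in an APIC frame the MIME type is always stored as an ISO-8859-1 string, whatever the frame's text encoding byte says. If those bytes are decoded as UTF-16 instead, each pair of ASCII characters collapses into a single CJK-range character. The sketch below uses two hypothetical decoders to show the effect on the bytes of "image/jpeg".

```typescript
// Decode one byte per character (ISO-8859-1).
function decodeLatin1(bytes: ArrayLike<number>): string {
    let out = '';
    for (let i = 0; i < bytes.length; i++) {
        out += String.fromCharCode(bytes[i]);
    }
    return out;
}

// Decode two bytes per character, little-endian (UTF-16LE, BMP only).
function decodeUtf16le(bytes: ArrayLike<number>): string {
    let out = '';
    for (let i = 0; i + 1 < bytes.length; i += 2) {
        out += String.fromCharCode(bytes[i] | (bytes[i + 1] << 8));
    }
    return out;
}

// The ten ASCII bytes of "image/jpeg".
const mimeBytes = [0x69, 0x6d, 0x61, 0x67, 0x65, 0x2f, 0x6a, 0x70, 0x65, 0x67];

console.log(decodeLatin1(mimeBytes));  // the intended MIME type
console.log(decodeUtf16le(mimeBytes)); // five characters instead of ten
```

The second result has the same shape as the five-character mime value reported above, which suggests the MIME field was read with the frame's text encoding rather than as ISO-8859-1.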

{'user-defined-text-information': '哾愀最最椀渀最琀椀洀攀',} looks incorrectly parsed.

A possible bug with user defined text information frame.

Tag doesn't parse.

I did some debugging, and it seems this tag doesn't parse because of a couple of issues:

Length of the buffer after the header is computed to be way too short (in parseV2FromBuffer)
Length of slice, after initial slice is always 0 so all frames are skipped (in calcFrameSize)

I am passing a Uint8Array:

ID3.parse(samples[0].unit).then(tag => {
     console.log(tag)
})

Specifically this is the data I get out of hls.js on timed metadata events on Hls.Events.FRAG_PARSING_METADATA.

Here is the Uint8Array for testing:

test = [73,68,51,4,0,0,0,0,0,63,80,82,73,86,0,0,0,53,0,0,99,111,109,46,97,112,112,108,101,46,115,116,114,101,97,109,105,110,103,46,116,114,97,110,115,112,111,114,116,83,116,114,101,97,109,84,105,109,101,115,116,97,109,112,0,0,0,0,0,191,178,114,97,73,68,51,4,0,0,0,0,3,84,84,82,83,78,0,0,0,27,0,0,3,83,116,114,101,97,109,83,32,83,116,114,101,97,109,105,110,103,32,69,110,99,111,100,101,114,0,84,82,83,79,0,0,0,23,0,0,3,72,105,45,70,105,32,73,110,116,101,114,110,101,116,32,83,116,114,101,97,109,0,84,73,84,50,0,0,0,30,0,0,3,77,101,97,110,44,32,71,114,101,101,110,44,32,75,105,108,108,105,110,103,32,77,97,99,104,105,110,101,0,84,80,69,49,0,0,0,10,0,0,3,79,118,101,114,107,105,108,108,0,87,88,88,88,0,0,0,38,0,0,3,97,114,116,119,111,114,107,85,82,76,95,54,52,48,120,0,104,116,116,112,58,47,47,103,105,109,109,101,114,97,100,105,111,46,99,111,109,84,70,76,84,0,0,0,9,0,0,3,77,80,71,47,65,65,67,0,84,88,88,88,0,0,0,51,0,0,3,99,114,98,0,83,116,114,101,97,109,83,32,76,105,118,101,32,72,84,84,80,32,69,110,99,111,100,101,114,32,83,101,114,118,105,99,101,32,40,118,32,49,46,50,46,50,46,57,52,41,84,88,88,88,0,0,0,51,0,0,3,101,110,99,0,83,116,114,101,97,109,83,32,76,105,118,101,32,72,84,84,80,32,69,110,99,111,100,101,114,32,83,101,114,118,105,99,101,32,40,118,32,49,46,50,46,50,46,57,52,41,84,88,88,88,0,0,0,33,0,0,3,100,101,118,0,76,105,110,101,32,50,32,40,86,105,114,116,117,97,108,32,65,117,100,105,111,32,67,97,98,108,101,41,84,88,88,88,0,0,0,6,0,0,3,97,111,116,0,50,84,88,88,88,0,0,0,8,0,0,3,97,100,114,0,50,53,54,84,88,88,88,0,0,0,10,0,0,3,97,115,114,0,52,56,48,48,48,84,88,88,88,0,0,0,6,0,0,3,97,99,104,0,50,84,88,88,88,0,0,0,26,0,0,3,99,114,100,0,50,48,49,55,48,50,48,52,32,50,51,58,51,50,58,53,51,32,85,84,67]

If you convert to a string you can see it has the v2 tags in it:

unit = new Uint8Array(test)
tagAsString = new TextDecoder('utf-8').decode(unit)

Roughly:

"ID3�?PRIV5com.apple.streaming.transportStreamTimestamp��raID3��TTRSN��StreamS Streaming EncoderTRSO��Hi-Fi Internet StreamTIT2��Mean, Green, Killing MachineTPE1
�OverkillWXXX&�artworkURL_640xhttp://foobar.comTFLT �MPG/AACTXXX3�crbStreamS Live HTTP Encoder Service (v 1.2.2.94)TXXX3�encStreamS Live HTTP Encoder Service (v 1.2.2.94)TXXX!�devLine 2 (Virtual Audio Cable)TXXX��aot2TXXX��adr256TXXX
�asr48000TXXX��ach2TXXX��crd20170204 23:32:53 UTC"

I realize all this might be because the minor id3v2 version is 4 and 2.4 does not appear to be supported yet. Is that correct? If so can you recommend another approach or outline what is needed to support 2.4 and maybe I'll try to implement it?
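As a starting point for the 2.4 question, the 10-byte ID3v2 header and the frame sizes can be decoded as sketched below. This is not the library's code: `decodeSyncsafe`, `readId3v2Header`, and `readFrameSize` are hypothetical names. The sketch relies on two points from the ID3v2 specifications: the tag size in the header is a 28-bit "syncsafe" integer (7 bits per byte), and id3v2.4 stores frame sizes as syncsafe integers too, whereas id3v2.3 stores them as plain 32-bit big-endian integers.

```typescript
interface Id3v2Header {
    major: number;
    revision: number;
    flags: number;
    size: number; // tag size in bytes, excluding the 10-byte header
}

// Decode a 4-byte syncsafe integer: only the low 7 bits of each byte count.
function decodeSyncsafe(bytes: ArrayLike<number>, offset: number): number {
    return ((bytes[offset] & 0x7f) << 21)
        | ((bytes[offset + 1] & 0x7f) << 14)
        | ((bytes[offset + 2] & 0x7f) << 7)
        | (bytes[offset + 3] & 0x7f);
}

// Read the header at `offset`, or return null if no "ID3" marker is there.
function readId3v2Header(bytes: ArrayLike<number>, offset: number = 0): Id3v2Header | null {
    if (bytes.length < offset + 10) {
        return null;
    }
    if (bytes[offset] !== 0x49 || bytes[offset + 1] !== 0x44 || bytes[offset + 2] !== 0x33) {
        return null;
    }
    return {
        major: bytes[offset + 3],
        revision: bytes[offset + 4],
        flags: bytes[offset + 5],
        size: decodeSyncsafe(bytes, offset + 6),
    };
}

// Frame size: syncsafe in v2.4, plain big-endian 32-bit in v2.3.
function readFrameSize(bytes: ArrayLike<number>, offset: number, major: number): number {
    if (major >= 4) {
        return decodeSyncsafe(bytes, offset);
    }
    return ((bytes[offset] << 24) | (bytes[offset + 1] << 16)
        | (bytes[offset + 2] << 8) | bytes[offset + 3]) >>> 0;
}

// First ten bytes of the sample in this issue.
console.log(readId3v2Header([73, 68, 51, 4, 0, 0, 0, 0, 0, 63]));
```

Applied to the sample, the first header reports major version 4 and a tag size of 63 bytes, so the first tag ends at byte 73. The bytes at that position are again 73, 68, 51 ("ID3"), which suggests the sample holds two concatenated tags; a parser that reads only the first one would see the PRIV frame and nothing else.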

Handling of null-terminated strings

Though string values in tags should be padded with spaces, I have found some files in my library with C-style null-terminated strings.

v1 headers use whiteRe. The regexp replaces blanks or nulls at the end of the string, but in C, nulls might be followed by whatever trash remained in the buffer, not necessarily spaces. Thus, this regexp might concatenate whatever non-null and non-blank trash lies after the first null with the actual data. The string should be truncated at the first null, with whatever remains ignored.

I had no trouble with whiteRe in my mp3s so I wasn't able to test alternatives, but I would change the regexp to:

/(^\s+|\s+$|\0.*$)/g

Literally, one or more blanks at either end or a single null followed by anything up to the end.

v2 headers don't truncate strings at nulls at all.

stringUtils.js provides readNullTerminatedString. It would be good to use that. Or, possibly faster, use indexOf('\x00') and if it is not -1 use substr(0, index) to truncate it. Or the same whiteRe as used for the v1 headers should work. Right now I am using the indexOf version and it works for me.
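The indexOf approach mentioned above could look like the following minimal sketch. `truncateAtNull` is a hypothetical name, not a function from the library, and the final trim is an assumption that mirrors the blank-stripping described for whiteRe.

```typescript
// Hypothetical helper: keep only the text before the first null character,
// then strip leading and trailing whitespace padding.
function truncateAtNull(value: string): string {
    const nullIndex = value.indexOf('\x00');
    const kept = nullIndex === -1 ? value : value.substring(0, nullIndex);
    return kept.trim();
}

// Anything after the first null is treated as leftover buffer content.
console.log(truncateAtNull('Title\x00leftover bytes'));
```

Compared with a regular expression, this version does not depend on how `.` treats line breaks in the trailing data.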

Could support for parsing FLAC be added?

Type Definition File

First off - this is by far the most simple and effective id3 parser I've found for node, thank you!

I've created a typing for this for use with typescript. It is also registered within the typings registry so it can be pulled into projects as a dependency.

You can find it here - https://github.com/dvdcxn/typed-id3-parser

I'm not sure how familiar you are with typescript but I reckon folk will find it useful. Would it be worth updating the readme with information on how to install it?

Readme provides no example for usage on local files

I'm trying to use this library in my node project, but I cannot figure out how to reference my local files. The documentation says (buffer|file|uint8Array) but only gives an example for uint8Array. I think I could figure out how to use a buffer pretty easily, but I just have a list of local files I need the script to go through, and passing a file path as a string results in an error. So how do I go about it?

Normally I can find examples elsewhere (ie StackOverflow) but I can't find any for this library :(

File with no id3 tags returning input instead of emitting error

When parsing a file with no actual ID3 tag info, I noticed that this simply results in the buffer fed into the function being returned. Presumably this is the case with other types of input.

I was wondering if it would be sensible for this to throw an error, or make such behaviour configurable. It would make handling this exceptional behaviour within the promise chain a little cleaner, especially when dealing with mapped files.

See the code below for an example of how this would be handled in practice: presumably, if the version object is falsy, parsing failed, so I throw an error inside the handler.

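function analyseFile(fileName) {
	// Promisify fs.readFile so it can start the promise chain
	let _readFile = bluebird.promisify(readFile);
	return _readFile(fileName)
		.then(id3.parse)
		.then(stats => {
			if (stats.version) {
				return stats;
			}
			// Throw an Error object rather than a bare string so a stack trace is kept
			throw new Error(`Couldn't determine id3 data for file ${fileName}`);
		})
		.catch(console.error);
}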

function analyseFiles(globs) {
	let _glob = bluebird.promisify(glob);
	return _glob(globs)
		.map(analyseFile);
}

Need Support for AngularJS/Javascript Plain

Please add support for AngularJS and plain JavaScript, as we need this capability in our current project.

Support chapter frames?

CHAP frames would be a nice addition!

TXXX type metadata added to ID3.4 by ffmpeg cannot be parsed

Hi! 👋

Firstly, thanks for your work on this project! 🙂

Today I used patch-package to patch id3-parser for the project I'm working on.

Without this small assumption in place, it was impossible to parse TXXX metadata written by ffmpeg version 4.4:
cat no-metadata.mp3 | ffmpeg -i pipe:0 -vn -acodec mp3 -metadata text="Привет мир!" -f mp3 test.mp3

Here is the diff that solved my problem:

diff --git a/node_modules/id3-parser/lib/parsers/v2parser.js b/node_modules/id3-parser/lib/parsers/v2parser.js
index 3fc9bb3..162ca24 100644
@@ -180,7 +181,7 @@ function parseFrame(bytes, minor, size) {
         // Value < text string according to encoding >
         if (header.id === 'TXXX') {
             variableStart = 11;
-            variableLength = utils_1.getEndpointOfBytes(bytes, encoding, variableStart) - variableStart;
+            variableLength = utils_1.getEndpointOfBytes(bytes, 0, variableStart) - variableStart;
             var value = {
                 description: utils_1.readBytesToString(bytes.slice(variableStart), encoding, variableLength),
                 value: '',

This issue body was partially generated by patch-package.

It's entirely written in TypeScript

It doesn't seem that any of the code is written in JavaScript.
I think you could not use the code without a TypeScript compiler!

Update your repository information ASAP!

What would be involved in implementing unsync support?

    // Currently do not support unsynchronisation
    if (flags.unsync) {
        throw new Error('no support for unsynchronisation');
    }
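As a rough answer to the question, the core of unsync support is reversing the byte stuffing. Per the ID3v2 specifications, unsynchronisation inserts a 0x00 byte after every 0xFF that would otherwise look like the start of an MPEG sync pattern or be followed by 0x00; decoding removes the 0x00 that follows each 0xFF. The sketch below shows only that step. `removeUnsynchronisation` is a hypothetical name, and where to apply it is left open: id3v2.3 applies the scheme to the whole tag body, while id3v2.4 applies it per frame.

```typescript
// Hypothetical helper: undo ID3v2 unsynchronisation by dropping the
// 0x00 byte that was inserted after each 0xFF.
function removeUnsynchronisation(bytes: Uint8Array): Uint8Array {
    const out = new Uint8Array(bytes.length);
    let written = 0;
    for (let i = 0; i < bytes.length; i++) {
        out[written++] = bytes[i];
        if (bytes[i] === 0xff && i + 1 < bytes.length && bytes[i + 1] === 0x00) {
            i++; // skip the stuffed zero byte
        }
    }
    // The decoded data is never longer than the input.
    return out.slice(0, written);
}
```

A parser could run this over the relevant bytes when flags.unsync is set, instead of throwing, and then continue with the normal frame parsing on the decoded data.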