Comments (7)
In view of the last comments (and the widespread use of R and Tidyverse packages in digital humanities projects), do you think the issue could be re-opened, @vspinu?
from lubridate.
Well, not that I use this regularly, but I just worked on a dataset that included negative years (i.e., BC) and had quite some difficulty dealing with them. For example, the following doesn't work:
> lubridate::ymd("-2255-01-01")
[1] "2255-01-01"
> lubridate::parse_date_time(-2255, "Y")
[1] NA
Warning message:
All formats failed to parse. No formats found.
The first call returns a Date object with a positive year, and the second fails to parse. That said, the following works:
> lubridate::ymd("0000-01-01") - lubridate::years(2255)
[1] "-2255-01-01"
which made me write a helper function that deals with negative years.
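A helper along these lines might look like the sketch below. It is not the helper mentioned above, only an illustration built on the workaround shown in this comment. The name `parse_signed_ymd` is hypothetical, the year is taken at face value as lubridate prints it (so "-2255" means whatever `ymd("0000-01-01") - years(2255)` produces), and 29 February in non-leap years is not handled.

```r
library(lubridate)

# Hypothetical helper (not part of lubridate): parse "yyyy-mm-dd" strings
# that may carry a leading minus sign, by shifting a date in year 0000.
parse_signed_ymd <- function(x) {
  pattern <- "^(-?)([0-9]{1,4})-([0-9]{2}-[0-9]{2})$"
  # -1 for a leading minus sign, +1 otherwise
  sign <- ifelse(sub(pattern, "\\1", x) == "-", -1L, 1L)
  year <- as.integer(sub(pattern, "\\2", x))
  month_day <- sub(pattern, "\\3", x)
  # Anchor month and day in year 0000, then shift by the signed year count
  ymd(paste0("0000-", month_day)) + years(sign * year)
}

d <- parse_signed_ymd("-2255-01-01")
year(d)
```

Strings that do not match the pattern are not rejected cleanly in this sketch; a real helper would want to validate its input first.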
from lubridate.
I can confirm that this feature would be very useful for anyone working on ancient periods (e.g. the Greek and Roman periods). This problem/choice is currently a deal breaker for Tidyverse and lubridate enthusiasts using R in Digital Humanities projects and courses. I'd be happy to test the modified functions.
from lubridate.
Hi! Thanks for the package and the work around it. If someone is considering implementing BCE dates, or handling them with the package as is (1.7.9), here are some thoughts about the problems caused by a phantom year zero.
Dealing with (non existing) "Year zero"
Explanations
If anyone is considering dealing with "before common era" dates in lubridate, be aware that Year zero doesn't exist. For historians, Year -1 is followed directly by Year 1 (see, for example, the Wikipedia chronology). Note also that this applies to the Julian calendar, which may conflict in minor ways with the Gregorian calendar we use nowadays; Wikipedia has more details. The missing year can cause a few problems.
Quick clarification for people unfamiliar with those notations:
- "CE" stands for "Common Era", a "de-christianisation" of the long-established and still widely used "AD", "Anno Domini" (so CE dates can be seen as "positive years")
- "BCE" stands for "Before Common Era", equivalent to "BC", "Before Christ" ("negative years").
E.g. Year 2021 (happy new year, btw!) would be "2021 CE". Socrates died in 399 BCE.
For instance, lubridate follows ISO 8601 (version 8601:2004, I presume? BCE dates could be handled with ISO 8601:2019, but the free-access part of the document is unclear about it), which starts at 0000-01-01, that is, 1 January of 1 BCE (Year -1).
This notation is confusing because it suggests that "0000-01-01" is Year 0 and that "-001-01-01" is Year -1, when it is actually Year -2, and it can cause problems when computing durations (see the code below).
That aside, if encountered, "0 CE/AD" or "0 BCE/BC" should probably be parsed into Year -1.
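The offset described above can be sketched in a few lines of base R. The function names `bce_to_astronomical` and `astronomical_to_label` are hypothetical and not part of lubridate; the sketch only assumes that ISO year 0000 corresponds to 1 BCE, as stated above.

```r
# Hypothetical helpers (not part of lubridate): convert between historical
# year labels (no year zero) and ISO 8601 / astronomical year numbers.

# n BCE -> astronomical year 1 - n (1 BCE -> 0, 2 BCE -> -1, 63 BCE -> -62)
bce_to_astronomical <- function(bce_year) {
  stopifnot(all(bce_year >= 1))
  1L - as.integer(bce_year)
}

# Astronomical year -> label such as "63 BCE" or "14 CE"
astronomical_to_label <- function(year) {
  year <- as.integer(year)
  ifelse(year <= 0L, paste(1L - year, "BCE"), paste(year, "CE"))
}

bce_to_astronomical(c(1, 2, 63))
#> [1]   0  -1 -62
astronomical_to_label(c(-62, 0, 14))
#> [1] "63 BCE" "1 BCE"  "14 CE"
```

Keeping the conversion in one place makes the "-1" adjustment explicit instead of scattering it through date arithmetic.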
References: Wikipedia (ISO 8601, Year zero, 1 BC, Common Era...)
Some code to make my point
(Licensed under WTFPL: Do What The Fuck You Want to)
pacman::p_load(lubridate)
pacman::p_version(lubridate)
#> [1] '1.7.9'
a <- ymd("0001-01-01")
a
#> [1] "0001-01-01"
# Year 1, no problem
b <- ymd("0000-01-01") - years(1)
b
#> [1] "-001-01-01"
# Is it Year -1?
# No, it's Year -2, even though it prints as -001-01-01,
# since ymd("0000-01-01") is already Year -1.
# The problem appears when we compute the duration between the two:
as.duration(a - b)
#> [1] "63158400s (~2 years)"
# Reading b as 1 January of Year -1, we would expect only one year here,
# since year zero doesn't exist.
Let's illustrate with Augustus dates:
- birth: 23 September 63 BCE
- death: 19 August 14 CE
- age at death: 75
aug_birth <- ymd("0000-09-23") - years(63)
aug_death <- ymd("0014-08-19")
age <- aug_death - aug_birth
as.duration(age)
#> [1] "2426889600s (~76.9 years)"
# That's one year too many!
# The correct form would be:
aug_birth <- ymd("0000-09-23") - years(63 - 1)
So a correct helper function to parse BCE dates in yyyy-mm-dd form would be:
parse_bce_ymd <- function(str) {
  # Capture the 4-digit BCE year and the "mm-dd" part separately
  regex <- "(\\d{4})-(\\d{2}-\\d{2})"
  match <- stringr::str_match(str, regex)
  # Beware the -1 here: 1 BCE is ISO year 0000, so n BCE is (n - 1) years before it
  years_n <- as.integer(match[, 2]) - 1L
  month_day <- match[, 3]
  ymd(paste0("0000-", month_day)) - years(years_n)
}
# Test the function.
aug_birth <- parse_bce_ymd("0063-09-23")
aug_death <- ymd("0014-08-19")
age <- aug_death - aug_birth
as.duration(age)
#> [1] "2395353600s (~75.9 years)"
# Yay that's correct!
Still, lubridate prints the BCE date with a year one less in absolute value than the "real" one (that is, one year later here), as if a year zero existed, which is misleading.
aug_birth
#> [1] "-062-09-23"
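As a cross-check on the Augustus example that avoids Date arithmetic entirely, the age can be worked out from year numbers alone. This is only a sketch in base R; the function name `age_across_eras` and its arguments are hypothetical.

```r
# Hypothetical helper: completed years between a BCE birth date and a CE
# death date, using the rule that n BCE is astronomical year 1 - n.
age_across_eras <- function(birth_bce, birth_month, birth_day,
                            death_ce, death_month, death_day) {
  birth_year <- 1L - as.integer(birth_bce)      # 63 BCE -> -62
  span <- as.integer(death_ce) - birth_year     # 14 - (-62) = 76
  # Had the birthday already occurred in the year of death?
  had_birthday <- (death_month > birth_month) ||
    (death_month == birth_month && death_day >= birth_day)
  if (had_birthday) span else span - 1L
}

# Augustus: born 23 September 63 BCE, died 19 August 14 CE
age_across_eras(63, 9, 23, 14, 8, 19)
#> [1] 75
```

The result agrees with the age at death given above and with the ~75.9 years returned by the corrected date arithmetic.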
from lubridate.
I am closing this. If someone insists that this should be done and is useful, please reopen.
from lubridate.
Hi. I know this is ancient, but is anyone thinking about this BCE/BC issue, or does lubridate now handle this natively?
from lubridate.
What's the problem, more concretely? What usage pattern do people have in mind?
from lubridate.