---
title: "Working with Different Types of Time Indices Using `tind` Class"
author: "Grzegorz Klima"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Working with Different Types of Time Indices Using `tind` Class}
  %\VignetteEngine{knitr::rmarkdown}
  \usepackage[utf8]{inputenc}
---
```{r, echo = FALSE}
library("tind")
```

The `tind` class is designed to represent time indices of different types and
to perform computations on them. Indices are stored as vectors of integers
or doubles. The following types of time indices are supported:

* years,
* quarters,
* months,
* weeks (ISO 8601),
* dates,
* time of day,
* date-time,
* arbitrary integer and numeric indices.


Time indices can be constructed by calling the `tind` constructor, by using
`as.tind` methods, or by calling the `parse_t` and `strptind` functions, which
parse arbitrary time index formats.

Before proceeding, let us load the package.
```{r color}
library("tind")
```


## Years

The internal code for years is `y`.

The simplest way to construct year indices is to invoke `tind` constructor
with a single argument `y`.

```{r}
(ys <- tind(y = 2010:2020))
```

The `as.tind` method automatically interprets integers in the range 1800--2199
as years. For instance:

```{r}
(ys <- as.tind(2010:2020))
```

To make `as.tind` interpret numbers outside the 1800--2199 range as years,
provide the `type` argument:
```{r}
(ys <- as.tind(c(1700, 1800, 1900, 2000, 2100), type = "y"))
```

Convenience function `as.year` is a shortcut for `as.tind(*, type = "y")`:
```{r}
(ys <- as.year(c(1700, 1800, 1900, 2000, 2100)))
```

Four-digit character strings in the format `YYYY` are also automatically
interpreted as year indices:
```{r}
(ys <- as.tind(c("1700", "1800")))
```


The `%Y` format specifier denotes a four-digit year. The same kind of
conversion can also be requested with an explicit format:
```{r}
(ys <- as.tind(c("1700", "1800", "1900", "2000", "2100"), format = "%Y"))
```

Two-digit year indices are indicated by `%y` format specifier. By default,
numbers in range 69--99 are interpreted as years 1969--1999 and numbers 00--68
as years 2000--2068:
```{r}
(ys <- as.tind(c("98", "99", "00", "01", "02"), format = "%y"))
format(ys, "%Y")
format(ys, "%y")
```
`as.year` guesses short year format:
```{r}
(ys <- as.year(c("98", "99", "00", "01", "02")))
```
Treatment of two-digit years is controlled by `tind.abbr.year.start` option.
```{r}
options("tind.abbr.year.start")
```
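
The default rule described above can be imitated in a few lines of base R.
This is only a sketch of the arithmetic: the helper `expand_two_digit_year`
and its `pivot` argument are hypothetical names introduced for illustration.
They do not show how `tind` implements the rule or which values the option accepts.

```{r}
# Hypothetical helper (not part of tind): two-digit years at or above
# `pivot` are mapped to the 1900s, the remaining ones to the 2000s.
expand_two_digit_year <- function(yy, pivot = 69L) {
    yy <- as.integer(yy)
    stopifnot(all(yy >= 0L & yy <= 99L, na.rm = TRUE))
    ifelse(yy >= pivot, 1900L + yy, 2000L + yy)
}
expand_two_digit_year(c(98, 99, 0, 1, 2))
```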



## Quarters

The internal code for quarters is `q`.

Quarters can be constructed using `tind` constructor with arguments `y` and `q`.
Arguments of `tind` constructor are recycled if necessary.

```{r}
(qs <- tind(y = rep(2020:2023, each = 4), q = 1:4))
```


The default format for quarters is `"%YQ%q"`, where `%q` format specifier
is a `tind` extension giving quarter number. This will be automatically recognized:
```{r}
(qs <- as.tind(c("2020Q1", "2020Q2", "2020Q3", "2020Q4")))
format(qs)
format(qs, "%YQ%q")
```

Less popular formats can be parsed using combinations of `%Y` (or `%y`)
and `%q` specifiers. Consider the `YYYY.Q` format:
```{r}
as.tind("2023.2", format = "%Y.%q")
```

One can also specify the order of index components using `order` argument:
```{r}
as.tind("2023.2", order = "yq")
```

When the components appear in `yq` order, setting `type = "q"` is enough for quarters to be parsed automatically:
```{r}
as.tind(c("2020 1", "2020 2", "2020 3", "2020 4"), type = "q")
```

Convenience function `as.quarter` is a shortcut for `as.tind(*, type = "q")`:
```{r}
as.quarter(c("2020 1", "2020 2", "2020 3", "2020 4"))
```


Packages `stats` and `zoo` (class `yearqtr`) represent quarters as year
fractions, e.g., `2020.0`, `2020.25`, `2020.5`, `2020.75` represent
`2020Q1`, `2020Q2`, `2020Q3`, `2020Q4`, respectively.
Conversion from this format can be done using the `yf2tind` function.
```{r}
yf2tind(2020 + (0:3) / 4, "q")
```
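
The year-fraction encoding itself is simple arithmetic. The base R sketch below
decodes it by hand, only to show what the fractions mean. It is not
the `tind` implementation, and the variable names are chosen for illustration.

```{r}
# Year fractions for the four quarters of 2020.
yf_q <- 2020 + (0:3) / 4
# Integer part is the year, fractional part times 4 gives the quarter offset.
yr_q <- floor(yf_q)
qtr_q <- as.integer(round((yf_q - yr_q) * 4)) + 1L
(lab_q <- sprintf("%dQ%d", as.integer(yr_q), qtr_q))
```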


## Months

The internal code for months is `m`.

Months can be constructed using `tind` constructor with arguments `y` and `m`.
Arguments of `tind` constructor are recycled if necessary.

```{r}
(ms <- tind(y = 2023, m = 1:12))
```


The default format for months is `"%Y-%m"`, where `%m` format specifier
represents month as a two-digit number.
```{r}
format(ms)
format(ms, "%Y-%m")
```


When the components appear in `ym` order, setting `type = "m"` is enough for months to be parsed automatically:
```{r}
as.tind("2023-11", type = "m")
```


Convenience function `as.month` is a shortcut for `as.tind(*, type = "m")`:
```{r}
as.month("2023-11")
```


Format specifier `%b` denotes abbreviated month name as in the following example.
See documentation of `month_names` for discussion of locale settings.
```{r}
(shrtms <- format(ms, "%b %y", locale = "C"))
```

The above can be parsed using format specification or using order specification
(with month first).
```{r}
as.tind(shrtms, format = "%b %y", locale = "C")
as.tind(shrtms, order = "my", locale = "C")
```



Packages `stats` and `zoo` (class `yearmon`) represent months as year
fractions, e.g., `2020.0`, `2020.0833`, `2020.1667`, `2020.25`
represent `2020-01`, `2020-02`, `2020-03`, `2020-04`, respectively.
Conversion from this format can be done using the `yf2tind` function.
```{r}
yf2tind(2020 + (0:11) / 12, "m")
```
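
Most twelfths cannot be represented exactly in binary floating point, so when
decoding monthly year fractions by hand it is safer to round than to truncate.
The following base R sketch illustrates the arithmetic. It is not the `tind`
implementation, and the variable names are chosen for illustration.

```{r}
# Year fractions for the twelve months of 2020.
yf_m <- 2020 + (0:11) / 12
yr_m <- floor(yf_m)
# round() guards against values such as 6.999999... before integer conversion.
mon_m <- as.integer(round((yf_m - yr_m) * 12)) + 1L
(lab_m <- sprintf("%d-%02d", as.integer(yr_m), mon_m))
```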


## Weeks (ISO 8601)

`tind` supports ISO 8601 weeks, i.e., weeks starting on Monday, with the first
week of a year being the week that contains the first Thursday of that year.
See https://en.wikipedia.org/wiki/ISO_week_date.


The internal code for weeks is `w`.

Weeks can be constructed using `tind` constructor with arguments `y` and `w`.
Arguments of `tind` constructor are recycled if necessary.

```{r}
(ws <- tind(y = 2024, w = 1:52))
```

The default format for weeks is `%G-W%V`, where `%G` is the week-based
year (ISO week-numbering year, ISO year) and `%V` ISO week number.


```{r}
format(ws)
format(ws, format = "%G-W%V")
```


Note that you cannot use `%Y`, `%W`, and `%U` specifiers with weeks. ISO
week-numbering year may differ from Gregorian (i.e. calendar) year `%Y`.
`%W` and `%U` formats refer to non-ISO weeks and are not supported.
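
The difference between the ISO week-numbering year and the calendar year can
be seen with base R dates alone. The sketch below assumes that the platform's
`format` method for `Date` supports `%G` and `%V` on output. The dates are
chosen around year boundaries for illustration.

```{r}
# Sunday 2024-12-29, Monday 2024-12-30, and Sunday 2021-01-03.
iso_dates <- as.Date(c("2024-12-29", "2024-12-30", "2021-01-03"))
data.frame(date = iso_dates,
           calendar_year = format(iso_dates, "%Y"),
           iso_week = format(iso_dates, "%G-W%V"))
```

The last two dates belong to ISO years that differ from their calendar years.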


`as.week` (a shortcut for `as.tind(*, type = "w")`) automatically recognizes
the default `%G-W%V` format.

```{r}
as.week("2024-W51")
```



## Dates

Dates (code `d`) can be most easily constructed from `y`, `m`, and `d`
components passed to `tind` constructor. Order of arguments is irrelevant.

```{r}
tind(m = 3, d = 15, y = 2024)
(ds <- tind(y = 2024, m = rep(1:3, each = 2), d = c(1, 16)))
```


The default format for dates is `%Y-%m-%d` (ISO format, shortcut `%F`).

```{r}
format(ds)
format(ds, format = "%Y-%m-%d")
format(ds, format = "%F")
```


`as.date` (a shortcut for `as.tind(*, type = "d")`) automatically recognizes
this format.

```{r}
as.date("2024-12-31")
```

US format `%m/%d/%y` (shortcut `%D`) is also automatically recognized.

```{r}
as.date("12/31/24")
format(as.date("12/31/24"), "%m/%d/%y")
format(as.date("12/31/24"), "%D")
```

Month names can also be used.
See documentation of `month_names` for discussion of locale settings.

```{r}
(chds <- format(ds, "%b %d, %y", locale = "C"))
as.tind(chds, order = "mdy", locale = "C")
```



## Time of Day

Type `h` (as in *h*our) represents times between midnight (00:00)
and midnight of the next day (24:00).

Time of day can be constructed from `H`, `M`, and `S` arguments passed
to `tind` constructor.

```{r}
tind(H = 0:23)
tind(H = 13, M = (0:3) * 15)
(tod1 <- tind(H = 13, M = 30, S = (0:11) * 5))
```

Sub-second accuracy is also supported.

```{r}
(tod2 <- tind(H = 13, M = 30, S = 17 + (0:9) / 10))
```

As seen above, `tind` automatically determines whether seconds should be
shown or sub-second accuracy is required.

Format specifiers `%H`, `%M`, and `%S` can be used with time of day.

```{r}
format(tod1, "%H:%M")
format(tod1, "%H:%M:%S")
as.tind("13:47", format = "%H:%M")
as.tind("13:47:39", format = "%H:%M:%S")
```


Sub-second accuracy can be explicitly requested via `%OS[0-6]` format specifier.
```{r}
format(tod2, "%H:%M:%S")
format(tod2, "%H:%M:%OS1")
format(tod2, "%H:%M:%OS2")
format(tod2, "%H:%M:%OS3")
```


For parsing, `%OS` (without digits) should be used.
```{r}
as.tind("13:47:39.89", format = "%H:%M:%OS")
```


`H`, `M`, and `S` can be used for order specification. With order specifier
`S` sub-second accuracy is automatically determined.
```{r}
as.tind("13", order = "H")
as.tind("13:47", order = "HM")
as.tind("13:47:39", order = "HMS")
as.tind("13:47:39.9", order = "HMS")
as.tind("13:47:39.89", order = "HMS")
```


12-hour clock can be used with the help of `%I` (hour in 12-hour clock)
and `%p` (ante meridiem, post meridiem) specifiers.


```{r}
format(tind(H = 0:23), "%I %p")
as.tind("9:30am", format = "%I:%M%p")
```
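
The mapping between the 24-hour and the 12-hour clock has two special cases:
midnight and noon are both shown as 12. The base R sketch below spells out the
arithmetic. It is an illustration only, with hypothetical variable names and
fixed `AM`/`PM` labels, and it does not reproduce the locale handling of `%p`.

```{r}
hours24 <- c(0L, 9L, 12L, 13L, 23L)
# Hours 0 and 12 map to 12, all other hours are taken modulo 12.
hours12 <- ifelse(hours24 %% 12L == 0L, 12L, hours24 %% 12L)
meridiem <- ifelse(hours24 < 12L, "AM", "PM")
(clock12 <- sprintf("%02d %s", hours12, meridiem))
```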


Alternatively, `I` and `p` order specifiers can be used.

```{r}
as.tind("9:30am", order = "IMp")
```


## Date-time

Date-time indices (code `t`) can be constructed from components required
for date and at least hour (`H`) component.

```{r}
tind(y = 2024, m = 8, d = 2, H = 16)
tind(y = 2024, m = 8, d = 2, H = 16, M = (0:3) * 15)
tind(y = 2024, m = 8, d = 2, H = 16, M = 0, S = 10 * (0:5))
```

Date-time indices always have the time zone attribute set. For more information,
see the documentation of the `tzone` function. By default, the time zone is set to
the system time zone, but it can be set explicitly using the `tz` argument.

```{r}
tind(y = 2024, m = 8, d = 2, H = 16, M = (0:3) * 15, tz = "UTC")
```

Date-time indices can also be constructed from date and time of day indices
using `date_time` function.

```{r}
(dt1 <- date_time(tind(y = 2024, m = 8, d = 2),
                  tind(H = 16, M = (0:3) * 15)))
(dt2 <- date_time(tind(y = 2024, m = 8, d = 2),
                  tind(H = 16, M = (0:3) * 15), tz = "UTC"))
```


Reverse operation can be performed using `date_time_split` function.

```{r}
date_time_split(dt1)
```


`date_time_split` returns a list, and lists convert trivially to data frames.
If a data frame is desired, wrap the call in `as.data.frame`.


```{r}
as.data.frame(date_time_split(dt1))
```

As seen above, the formatting of date-time indices depends on the indices themselves
(whether seconds or sub-second accuracy are needed). Moreover, the `as.character` and
`format` methods differ: the former returns the UTC offset (or `Z` for UTC), the latter
the time zone abbreviation.

```{r}
as.character(dt1)
as.character(dt2)
format(dt1)
format(dt2)
```


All format specifiers that can be used with dates and time of day can also be used
with date-time. `%z` specifier represents UTC offset, `%Z` returns time zone
abbreviation.

```{r}
format(dt1, "%F %H:%M")
format(dt1, "%F %H:%M%z")
format(dt1, "%F %H:%M %Z")
format(dt1, "%D %I:%M%p")
```


Standard formats are automatically recognized during parsing.
```{r}
as.tind("2025-02-01 13:03")
as.tind("2025-02-01 13:03:34")
as.tind("2025-02-01 13:03:34.534")
```


When converting to `tind`, either `format` or `order` arguments can be provided.
```{r}
as.tind("2025-02-01 13:03:34.534", format = "%F %H:%M:%OS")
as.tind("02/01/25 01:03:34pm", format = "%D %I:%M:%OS%p")
as.tind("2025-02-01 13:03:34.534", order = "ymdHMS")
as.tind("2025-02-01 13:03:34.534", order = "ymdHMS", tz = "UTC")
as.tind("02/01/25 01:03:34pm", order = "mdyIMSp")
as.tind("02/01/25 01:03:34pm", order = "mdyIMSp", tz = "UTC")
```


The parser recognizes time zone abbreviations:

```{r}
as.tind("2025-02-22 09:54:04 CET", tz = "Europe/Warsaw")
as.tind("2025-08-23 09:54:04 CEST", tz = "Europe/Warsaw")
as.tind("2/22/25 9:54 a.m. EST", order = "mdyIMpz", tz = "America/New_York")
as.tind("8/23/25 9:54 a.m. EDT", order = "mdyIMpz", tz = "America/New_York")
```
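
The abbreviation that applies in a given time zone depends on the date, because
of daylight saving time. This can be checked with base R date-time classes. The
sketch below assumes that the system's time zone database contains
`Europe/Warsaw`.

```{r}
# The same wall-clock time in winter and in summer.
warsaw <- as.POSIXct(c("2025-02-22 09:54:04", "2025-08-23 09:54:04"),
                     tz = "Europe/Warsaw")
format(warsaw, "%Z")  # time zone abbreviation
format(warsaw, "%z")  # UTC offset
```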

When time zone is not provided, the parser tries to guess the time zone:
```{r}
as.tind("2025-02-22 09:54:04 CET")
as.tind("2025-08-23 09:54:04 CEST")
as.tind("2/22/25 9:54 a.m. EST", order = "mdyIMpz")
as.tind("8/23/25 9:54 a.m. EDT", order = "mdyIMpz")
```

When a time zone abbreviation can denote different UTC offsets (unfortunately,
this can be the case), `NA`s are introduced with a warning.


## Arbitrary Numeric Indices

For completeness, `tind` supports arbitrary integer indices (code `i`)
and arbitrary numeric indices (code `n`).

```{r}
as.tind(0:9, type = "i")
as.tind(0:9 / 10, type = "n")
```



## Index Conversion

Time indices of one type can be converted to another using the `as.tind`
method or the `as.year`, `as.date`, etc. convenience functions.

```{r}
ms
as.quarter(ms)
as.year(ms)
as.date(ms)
as.date_time(ms)
as.date_time(ms, tz = "UTC")
```



## Coercion to Base R Types

`as.Date`, `as.POSIXct`, and `as.POSIXlt` can be used to convert time indices
to base R date and date-time classes.

```{r}
ds
as.Date(ds)
dt1
as.POSIXct(dt1)
dt2
as.POSIXlt(dt2)
```



## Matching Periods, Comparisons, `cut` Method

The `match_t` function and the `%in_t%` operator match time indices
against another set of time indices, possibly of a different type (lower resolution).

In the following example, a sequence of dates is matched to months:
```{r}
(x <- as.date("2025-03-02") + 15 * (0:5))
(table <- as.month("2025-03") + -1:1)
match_t(x, table)
```
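
The idea of matching against a lower-resolution index can be imitated in base R
by truncating dates to months before calling `match`. This is a sketch of the
concept only, with hypothetical variable names. It does not show how `match_t`
is implemented.

```{r}
base_days <- as.Date("2025-03-02") + 15 * (0:5)
month_labels <- c("2025-02", "2025-03", "2025-04")
# Truncate each date to its year-month label, then look the label up.
(pos <- match(format(base_days, "%Y-%m"), month_labels))
```

Dates falling in months that are absent from the lookup vector yield `NA`.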

Below we check which dates fall in March 2025:
```{r}
x %in_t% "2025-03"
```


Comparison operators (e.g., `>`, `>=`) can be used to compare time indices.
Below we check which dates fall in or after April 2025 and which fall before April 2025:
```{r}
x >= "2025-04"
x < "2025-04"
```


The `cut` method for objects of the `tind` class divides time indices into periods.
Using `x` from the last example, we can split dates into months and quarters:
```{r}
cut(x, "m")
cut(x, "q")
```



