WebVTT is a World Wide Web Consortium (W3C) standard for displaying timed text in connection with the HTML5 <track> element. The early drafts of its specification were written by WHATWG in 2010, after discussions about which caption format HTML5 should support. The main options were the relatively mature, XML-based Timed Text Markup Language and an entirely new but more lightweight standard based on the widely used SubRip format. The final decision was for the new standard, initially called WebSRT. It shared the .srt file extension and was broadly based on the SubRip format, though not fully compatible with it. The prospective format was later renamed WebVTT. In the January 13, 2011 version of the HTML5 Draft Report, the <track> tag was introduced and the specification was updated to document WebVTT cue text rendering rules. The WebVTT specification is still in draft stage, but the basic features are already supported by all major browsers.
Styling is done with CSS applied to C tags, in a separate file defined in the companion HTML document, instead of with the FONT tag.
Cue settings allow the positioning of cues on the video to be customized.
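As an illustration of cue settings, the following is a minimal sketch in Python that splits a settings string into name and value pairs. It assumes settings are whitespace-separated name:value tokens, as in strings such as "align:end size:50%"; the function name parse_cue_settings is hypothetical and not part of any WebVTT library, and the sketch does not validate names or values against the specification.

```python
def parse_cue_settings(settings):
    """Split a cue settings string into a dict of name -> value.

    Assumption: settings are whitespace-separated "name:value" tokens.
    Malformed tokens are skipped rather than treated as errors.
    """
    parsed = {}
    for token in settings.split():
        name, sep, value = token.partition(":")
        if not sep or not name or not value:
            continue  # ignore tokens that are not of the form name:value
        parsed[name] = value
    return parsed


if __name__ == "__main__":
    print(parse_cue_settings("align:end size:50%"))
```

A player could use such a mapping to decide where to draw a cue, for example placing an align:end cue toward the end edge of a box that takes up half of the video width.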
Compatibility
Firefox implemented WebVTT in its nightly builds, but initially it was not enabled by default. The feature had to be enabled by going to the "about:config" page and setting the value of "media.webvtt.enabled" to true. YouTube began supporting WebVTT in April 2013. On July 24, 2014, Mozilla enabled WebVTT in Firefox by default.
Example of WebVTT format
WEBVTT Kind: captions; Language: en

00:09.000 --> 00:11.000
We are in New York City

00:11.000 --> 00:13.000
We are in New York City

00:13.000 --> 00:16.000
We're actually at the Lucern Hotel, just down the street

00:16.000 --> 00:18.000
from the American Museum of Natural History

00:18.000 --> 00:20.000
And with me is Neil deGrasse Tyson

00:20.000 --> 00:22.000
Astrophysicist, Director of the Hayden Planetarium

00:22.000 --> 00:24.000
at the AMNH.

00:24.000 --> 00:26.000
Thank you for walking down here.

00:27.000 --> 00:30.000
And I want to do a follow-up on the last conversation we did.

00:30.000 --> 00:31.500 align:end size:50%
When we e-mailed—

00:30.500 --> 00:32.500 align:start size:50%
Didn't we talk about enough in that conversation?

00:32.000 --> 00:35.500 align:end size:50%
No! No no no no; 'cos 'cos obviously 'cos

00:32.500 --> 00:33.500 align:start size:50%
Laughs

00:35.500 --> 00:38.000
You know I'm so excited my glasses are falling off here.
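The timing lines in the example above follow the pattern "start --> end", optionally followed by cue settings. As a rough sketch of how such a line could be read, the Python code below converts the two timestamps to seconds and returns any trailing settings text unchanged. The names TIMING, to_seconds and parse_timing_line are hypothetical, and the regular expression covers only the mm:ss.ttt and hh:mm:ss.ttt forms; it is not a complete implementation of the WebVTT parsing rules.

```python
import re

# Assumed timestamp shapes: "mm:ss.ttt" or "hh:mm:ss.ttt" (hours optional).
TIMING = re.compile(
    r"^(?:(\d{2,}):)?(\d{2}):(\d{2})\.(\d{3})\s+-->\s+"
    r"(?:(\d{2,}):)?(\d{2}):(\d{2})\.(\d{3})(?:\s+(.*))?$"
)


def to_seconds(hours, minutes, seconds, millis):
    """Convert timestamp components (as strings) to seconds as a float."""
    return (int(hours or 0) * 3600 + int(minutes) * 60
            + int(seconds) + int(millis) / 1000)


def parse_timing_line(line):
    """Return (start_seconds, end_seconds, settings_text) for a timing line."""
    match = TIMING.match(line.strip())
    if match is None:
        raise ValueError("not a cue timing line: %r" % line)
    groups = match.groups()
    start = to_seconds(*groups[0:4])
    end = to_seconds(*groups[4:8])
    settings = groups[8] or ""  # empty string when no settings follow
    return start, end, settings


if __name__ == "__main__":
    print(parse_timing_line("00:30.000 --> 00:31.500 align:end size:50%"))
    # -> (30.0, 31.5, 'align:end size:50%')
```

Returning the settings as raw text keeps this step separate from interpreting them, so a caller can decide how strictly to handle unknown or malformed settings.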
Unsupported features
In June 2013, an example that included a new "region" setting was added to the specification. As of February 2015, however, no player supported this feature.