Develop advanced web content

submitted by
Style Pass
2021-06-11 05:30:05

Develop in JavaScript, WebGL, or WebAssembly? Learn how the latest updates to Safari and WebKit — including language changes to class syntax — can help simplify your development process, enhance performance, and improve security. We'll explore several web APIs that can help provide better interoperability and bring new capabilities to your web content.

♪ Bass music playing ♪ ♪ Sihui Liu: Hello. Welcome to “Develop advanced web content.” I'm Sihui, and I am an engineer on the Safari and WebKit team. I am happy to share with you the important updates we have made in WebKit and Safari for web developers in the past year. The things I'm about to share generally fall into three categories. First, I will walk you through new features and enhancements in JavaScript. Then I will give you an overview of the updates in WebAssembly. And finally, I will introduce you to some new web APIs that can add additional capabilities to your web content.

There is a lot to cover, so let’s begin with our news in JavaScript. Each year, there are hundreds of changes made in our JavaScript engine. I will cover some of the most important ones you need to know if you work with JavaScript. They are: new class field syntax, weak references that enable smarter memory management, new usage of the await keyword, support for modules in workers, and interfaces added to the Internationalization API family. To put you in the picture of these new features, I'm going to use a simple stopwatch as an example. The stopwatch has only one button. Click it once, it begins counting. Click it again, it will stop and give you the duration passed. Keep this in mind and we will implement it later in JavaScript.

Now, let’s take a look at the new class field syntax. We have new private class fields and methods that let you define real private members whose access is protected by the language. You will see an error if you break the access rules. We also add support for static fields, which allows you to declare a class member that can be accessed without creating an instance of the class. Now you have the basic idea. Let’s check how it can be used with the stopwatch example. If you are asked to implement the stopwatch class, your implementation may look like this. StopwatchWithOneButton has only one method named click(). It checks the startTime variable.
If start time is unset, the click means to start, so it will set the start time. If start time is already set, the click means to stop, so it will calculate a duration and reset the start time. You can see startTime comes with an underscore prefix. This is a naming convention commonly used to denote that the variable should only be used inside of the class. But that does not actually prevent the start time from being accessed publicly. The new private syntax can help fix this. Just replace the underscore with a hash, and you can declare a real private instance field. The encapsulation is enforced by the language.

There is also support for private methods. For example, to make the click() method more structured, we can create two private methods, start() and stop(), to replace the highlighted content, like this. By adding the hash prefix to methods, we make sure the member functions can only be accessed from inside of the class. The new private syntax also applies to static fields like startedStopwatchCount. Here, startedStopwatchCount can only be modified by stopwatch objects at start or stop times. Of course, if you want startedStopwatchCount to be accessed everywhere, you can declare it as a public static field, without the hash prefix. Public static fields are also available in WebKit now. Private instance fields, methods, private static fields, and public static fields; that’s our new class field support.

And let’s continue to another feature, weak references. Weak references allow you to hold a reference to a JavaScript object in a way that does not prevent garbage collection. Unlike WeakMap and WeakSet, you can get the underlying object without already having a reference to it. The support also includes notification on garbage collection, so you may perform some cleanup task if you need to. And let’s see how it can be used. We just implemented the stopwatch class. Now imagine you have created multiple stopwatch objects for different tasks.
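The slides are not part of this transcript, so the following is only a minimal sketch of what the private-field version of the stopwatch class described above might look like. The names StopwatchWithOneButton, click, start, stop, startTime, and startedStopwatchCount come from the talk; the use of Date.now(), the increment and decrement of the counter, and the startedCount getter are assumptions made for illustration.

```javascript
class StopwatchWithOneButton {
  // Private static field: only code inside this class can read or write it.
  static #startedStopwatchCount = 0;

  // Private instance field; undefined means the stopwatch is not running.
  #startTime;

  // Hypothetical read-only accessor, added only so the sketch can be inspected.
  static get startedCount() {
    return StopwatchWithOneButton.#startedStopwatchCount;
  }

  // The single public method: the first click starts, the second click stops.
  click() {
    if (this.#startTime === undefined) {
      this.#start();
      return undefined;
    }
    return this.#stop();
  }

  // Private method: record the start time and count this stopwatch as started.
  #start() {
    this.#startTime = Date.now();
    StopwatchWithOneButton.#startedStopwatchCount++;
  }

  // Private method: compute the duration in milliseconds and reset the state.
  #stop() {
    const duration = Date.now() - this.#startTime;
    this.#startTime = undefined;
    StopwatchWithOneButton.#startedStopwatchCount--;
    return duration;
  }
}
```

Outside the class body, writing something like stopwatch.#startTime is rejected as a syntax error, which is the language-level enforcement the talk refers to.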
For testing, you need to click all of them at once. How would you do that? An intuitive way is to keep a set of all stopwatch instances. When the stopwatch is created, add it to the set. Then in the clickAllStopwatches function, iterate the set and click each stopwatch. But there is an issue with this approach. We know JavaScript objects hold strong references by default, so in this case all stopwatch objects cannot be garbage collected because the set still has a reference to them. Of course, we don’t want to keep all stopwatch objects around just for testing. This is not great for memory use. Now, you may suggest just replacing Set with WeakSet, but WeakSet is not iterable. So what would you do?

We can solve it with the new interface WeakRef, which holds weak references to an object. We still have the set, but this time we add WeakRef of stopwatch objects to the set. In the clickAllStopwatches function, we check if the object still exists by dereferencing it before clicking. This seems to solve our issue, but there is another problem: we don’t remove garbage-collected stopwatches from the set in a timely manner, and the set can grow quite large before our next click test. Now what should we do?

Another new interface -- FinalizationRegistry -- may help in this case. With it, you can specify a callback to be invoked when some object is garbage collected. Here we create a FinalizationRegistry object with the removeStopwatch function, so this function is called every time an object is collected. Then, we register stopwatch objects with the registry. Each stopwatch is bound with an identifier, so removeStopwatch knows which stopwatch to remove. Good, now garbage-collected stopwatches will be removed from allStopwatches. The usage of weak references does not sound that hard, right? But be aware that garbage collection in JavaScript is very complicated, and there is a lot of uncertainty.
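As a rough sketch of the WeakRef and FinalizationRegistry approach just described: the names allStopwatches, removeStopwatch, and clickAllStopwatches come from the talk, while the Stopwatch class body, the numeric identifiers, and the choice of a Map keyed by identifier (the talk mentions a set) are assumptions made so the cleanup callback has something simple to delete.

```javascript
// Weak references to every stopwatch, keyed by a numeric identifier (assumed).
const allStopwatches = new Map();
let nextId = 0;

// Cleanup callback: receives the identifier that was passed to register().
function removeStopwatch(id) {
  allStopwatches.delete(id);
}
const registry = new FinalizationRegistry(removeStopwatch);

class Stopwatch {
  clicks = 0; // stand-in for real stopwatch state

  constructor() {
    const id = nextId++;
    // A WeakRef does not keep the stopwatch alive on its own.
    allStopwatches.set(id, new WeakRef(this));
    // Ask for removeStopwatch(id) to run some time after this object is collected.
    registry.register(this, id);
  }

  click() {
    this.clicks++;
  }
}

// Click every stopwatch that still exists; returns how many were clicked.
function clickAllStopwatches() {
  let clicked = 0;
  for (const ref of allStopwatches.values()) {
    const stopwatch = ref.deref(); // undefined once the object has been collected
    if (stopwatch !== undefined) {
      stopwatch.click();
      clicked++;
    }
  }
  return clicked;
}
```

The deref() check is still needed even with the registry in place, because the cleanup callback may run long after collection, or in some situations not at all.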
For example, the object you think should be collected may not actually be collected until a long time after, and you may not get the callback from FinalizationRegistry right away because it runs on the event loop. Therefore, make sure you fully understand the syntax and its expected behavior before use.

Let’s move on from weak references to the next feature, top-level await. This is a new feature for modules. It enables you to use the await keyword outside of an async function. In this case, the module itself is like a big async function, so an async module can block the execution of the modules importing it. Let me show you an example with our stopwatch class. This is the class we just created. To illustrate a use of top-level await, let’s make it a module and export the class. This is an HTML file that contains an inline module. It imports the stopwatch module using dynamic import. The import function returns a promise, so we can use then or catch methods to perform actions after import is done. With top-level await, you can remove the chaining methods and write the code in a synchronous way. This can make your code easier to follow. Also, because imported modules are evaluated at load time, an async module can block execution of modules depending on it. That means if the stopwatch module runs async operations and waits for the result, the stopwatch variable here will be initialized after the stopwatch module finishes execution. Top-level await has made it easier to do dependency management. But again, this feature is only available in modules, so if the script is not a module, like this...

...you will see a syntax error in Web Inspector.

Speaking of modules, there’s another related feature: module workers. Workers have some well-known benefits. They can run scripts in a background thread, so resources can be utilized more efficiently. With this new support, workers now share the benefits of modules, including dynamic import, optimized loading and execution, and dependency management. It is more beneficial and easier for you to move heavy work from the main thread to a background thread now. Modules are now available in different types of workers, including web workers, service workers, and worklets. To create a module worker, for web workers and service workers, you need to specify type to be module in the options. For a worklet like Audio Worklet, you can use the addModule function. It’s quite easy to create a module worker that helps speed up your application.

Last in the JavaScript section is updates on the Internationalization API. This API provides language-based formatting. It is useful if your web content is built for users in different locales. To show you how it can be used, I built this stopwatch records page because, you know, stopwatch needs to keep up with times and our feature releases. This page shows us details about a single use of stopwatch, including duration, start time, event, participants, and the available languages of the page. Now, let’s dive into each section and take a closer look at each interface.

First is NumberFormat. NumberFormat provides language-sensitive number formatting, and it is used to format the duration. The constructor of NumberFormat takes two optional parameters: language and options. Here I set the language to English and I make two options objects, which specify different minimum numbers of digits. After creating two NumberFormat objects with language and options, we can use them to format the duration numbers. Here, if the number is not milliseconds, I use Format1 to keep two digits; otherwise I use Format2 to keep three digits.
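A small sketch of the duration formatting described above, assuming the "minimum numbers of digits" option is minimumIntegerDigits and that the duration is kept in milliseconds; the formatDuration helper and the hh:mm:ss.mmm layout are hypothetical, not taken from the slides.

```javascript
const language = 'en';

// Two formatters that differ only in the minimum number of integer digits.
const twoDigits = new Intl.NumberFormat(language, { minimumIntegerDigits: 2 });
const threeDigits = new Intl.NumberFormat(language, { minimumIntegerDigits: 3 });

// Hypothetical helper: render a millisecond duration as hh:mm:ss.mmm.
function formatDuration(totalMilliseconds) {
  const hours = Math.floor(totalMilliseconds / 3600000);
  const minutes = Math.floor(totalMilliseconds / 60000) % 60;
  const seconds = Math.floor(totalMilliseconds / 1000) % 60;
  const milliseconds = totalMilliseconds % 1000;

  // Hours, minutes, and seconds keep two digits; milliseconds keep three.
  const clock = [hours, minutes, seconds].map((n) => twoDigits.format(n)).join(':');
  return clock + '.' + threeDigits.format(milliseconds);
}

const example = formatDuration(3723004); // "01:02:03.004"
```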
As you can see, the format method automatically adds padding zeroes for us. There are many different options you can utilize to create formats you need, such as style, where you can specify the value to be currency or unit.

Next is DateTimeFormat, which enables language-sensitive date and time formatting. The usage is similar to NumberFormat. First, set the language. Then, set the options. In the options, I set different styles for date and time. The DateTimeFormat object provides a fine-grained configuration that even allows you to specify style for second or millisecond. After that, we can create a DateTimeFormat object with parameters, and use it to format our start time. The result is represented in English. You can see the date is more detailed because it has the long style.

The next one is Segmenter. It enables you to do language-sensitive string splitting. I used it to find keywords in the event sentence. This is the Chinese version of the stopwatch records page. First, I declare a short list of keywords I want to highlight. The event string even includes a Unicode character for the degree Celsius symbol. Here we specify Chinese as the language. In options, the granularity is set to word. The other possible values are grapheme and sentence. Then we create a Segmenter and use it to split the string with the segment method. We can iterate the result objects to get all segments. Check and see if each segment is contained in the keyword list to mark it. Segmenter is quite useful for interpreting languages, like Chinese, where the word boundary is not that obvious.

Next is ListFormat, which enables language-sensitive list formatting. The same as before, we can specify language and options. ListFormat does not have as many options as the other interfaces. The most useful ones I found are type and style. With language and options, we can create a ListFormat and format the participants list we have.
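To make the ListFormat step concrete, here is a brief sketch using the conjunction type and long style mentioned in the talk; the participant names are placeholders invented for the example.

```javascript
// Placeholder participants; the real page's names are not in the transcript.
const participants = ['Alice', 'Bob', 'Carol'];

// Language first, then options, as with the other Intl interfaces.
const conjunctionFormat = new Intl.ListFormat('en', { type: 'conjunction', style: 'long' });
const conjunctionList = conjunctionFormat.format(participants); // "Alice, Bob, and Carol"

// Changing only the type switches the joining word.
const disjunctionFormat = new Intl.ListFormat('en', { type: 'disjunction', style: 'long' });
const disjunctionList = disjunctionFormat.format(participants); // "Alice, Bob, or Carol"
```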
As you can see, because the type is conjunction and the style is long, the format method adds a comma and the word “and” in the result.

The last one is DisplayNames. It provides consistent translation of display names for language, region, and script. Here I specify the language to be Japanese. DisplayNames can take a language code as input. In the options, we set type as language. Then we can create a DisplayNames object. And here, using the of method, we can get the translated result. Even though this page is built in English, Japanese users can know what languages are supported. And this is how I built the stopwatch records page with new internationalization interfaces. To refresh your memory, here is the list of things we’ve just looked at in the JavaScript section.

Following that, our next stop is updates in WebAssembly. We’ve been shipping our WebAssembly engine for a while, but in case you’re not familiar with it, let me begin by filling in some background on WebAssembly. WebAssembly is a binary instruction format for a stack-based virtual machine. It is a type of code that can be run in modern web browsers with performance close to native code. WebAssembly is designed to be a portable compilation target for programming languages like C, C++, or Rust, so WebAssembly can help us deploy applications written in those languages on the web. In most use cases of WebAssembly, it runs alongside JavaScript. They can communicate with each other through the WebAssembly API. WebAssembly can provide near-native performance, and makes powerful frameworks available on the web. JavaScript can manipulate the DOM and offers powerful web APIs. They can be good additions to each other. A good example of WebAssembly use is Funky Karts. It is a game converted from C++ to WebAssembly with Emscripten. As you can see, it gets to run very smoothly in Safari.
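To illustrate how JavaScript and WebAssembly talk to each other through the WebAssembly API, here is a minimal sketch. The module is a hypothetical, hand-assembled one that exports a single add function; it is not taken from the talk, and a real project would normally get its bytes from a compiler such as Emscripten.

```javascript
// Hand-assembled bytes of a tiny, hypothetical module that exports add(a, b).
const wasmBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,       // magic "\0asm", version 1
  0x01, 0x07, 0x01, 0x60, 0x02, 0x7f, 0x7f, 0x01, 0x7f, // type section: (i32, i32) -> i32
  0x03, 0x02, 0x01, 0x00,                               // function section: one function of type 0
  0x07, 0x07, 0x01, 0x03, 0x61, 0x64, 0x64, 0x00, 0x00, // export section: "add" is function 0
  0x0a, 0x09, 0x01, 0x07, 0x00,                         // code section: one body, no locals
  0x20, 0x00, 0x20, 0x01, 0x6a, 0x0b,                   // local.get 0, local.get 1, i32.add, end
]);

// Synchronous compile and instantiate, which is reasonable for a module this small.
const wasmModule = new WebAssembly.Module(wasmBytes);
const wasmInstance = new WebAssembly.Instance(wasmModule, {});

// The exported function is callable like any other JavaScript function.
const sum = wasmInstance.exports.add(2, 3);
```

For modules fetched over the network, the asynchronous WebAssembly.instantiateStreaming function is usually the better fit, since it compiles while the bytes are still downloading.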
This year, we’ve upgraded our WebAssembly engine with the following features: new memory instructions that give you better performance on bulk memory operations, like copying or initializing blocks of memory; new instructions that tell the user process not to trap on exceptions, like positive overflow when converting between float and int; new sign-extension operators that let you extend a signed integer; implementation of the latest proposal to convert between WebAssembly type i64 and JavaScript BigInt, which is simpler than previous solutions and can make your code run faster; new reference types that allow WebAssembly modules to hold references to JavaScript and DOM objects, passing them as arguments and storing them; and finally, streaming download and compilation of WebAssembly that shortens overall execution time. These are the highlights of our new WebAssembly features. We hope they will help your development.

Now, let's move from powerful low-level code to some high-level APIs. In this section, we are going to explore the new web APIs. My goal is to not only let you know about the new features but also make you feel you are ready to use them, so you will see some good examples. But this will not be a complete tutorial, so remember to check official documentation before use. This is a preview of the features I will talk about. Some of them are completely new, like Speech Recognition, and some of them are already there, but we have some updates we’d like to share, like Storage Access. Now, let’s dig into each of them.

We know that to make web content attractive, it’s very important to provide an amazing visual experience. With WebGL2 being available in WebKit and Safari, it’s easier for you to create beautiful, interactive web content. Here is a good example of what can be done with WebGL2. After the Flood is an interactive demo developed by PlayCanvas. You can see the gentle wind sways the tree. It looks vivid in Safari. So what is WebGL2?
WebGL is a very widely used low-level API for rendering 2D and 3D graphics. WebGL2 is an upgrade of WebGL that eliminates fallbacks and introduces some cool new features. It adds 3D textures to allow rendering volumetric effects like clouds. It has sampler objects that give you more flexibility about how to use textures in shaders. It provides transform feedback that helps you implement performant particle systems on the GPU. There are so many great new features in WebGL2. And more importantly, WebGL2 is now available in Safari on all Apple devices. That means you can build a beautiful site that looks great everywhere.

And let’s get more familiar with WebGL2 with an example: creating an orange square. And this is the JavaScript code you need to write for it. If you have not used WebGL before, this may not be as easy as you would have imagined. As I mentioned, because WebGL is a low-level API, it can be very verbose. But don’t worry; there are many great libraries and frameworks that can help simplify your development. With them, it’s not that hard to create a nice square or something more complicated than that. Now, if you already use WebGL in your web content, there is also good news. We have improved our support by migrating the backend from OpenGL to Metal. That means iOS Simulator is now able to use the GPU for web content, making it a much more accurate representation of what your users will see. Also, you can use Metal tools, such as the Xcode Frame Debugger, to analyze your WebGL code now.

Besides creating content with WebGL, another common way to provide a great visual experience is through video. Not all browsers have the same kind of support for media formats, so sometimes it might be tricky for you to decide which format you are going to use. To make things easier for you, this year, we have increased our support for WebM, a common media format on the web. For a start, the support is only for streaming playback.
In macOS 11.3, we added support for playing WebM files containing VP8 or VP9 video and Vorbis audio. And in macOS 12, we add support for files containing Opus audio. Last year, we started supporting WebM played through Media Source Extensions on macOS. Now, we’re bringing that support to iPadOS 15. To check if WebM is supported in your code, you can use MediaCapabilities API, which lets you detect the exact media configuration you want to use. The configuration on the screen is supported in the latest Safari, and that means VP9 is also supported now. With support for this video-coding format, we expect more web content to be available in Safari and WebKit apps. You can use VP9 in both streaming and WebRTC. It works on macOS and iPadOS. Regarding support on different devices, it is available on all Apple silicon Macs. For the others, you can check with MediaCapabilities API, just like what we just saw for WebM. Now, if your site has WebM or VP9 content, I encourage you to check how it works in the latest Safari and WebKit; but if you are still deciding which media format to use, we would recommend H.264 or HEVC. H.264 is mature and well-supported across browsers. HEVC has great support for high-quality videos. They both come with hardware acceleration that can provide smoother playback and longer playback battery life.

Talking about hosting video content, a common case is that we don’t own the content; instead, we get it from a third party. For example, I see this nice video on video.domain. To make it appear on my site, main.domain, I can load this video source from video.domain, or I just create iframes of video.domain. For security reasons, third-party iframes or resources do not have access to first-party storage by default. And that means if the resource request for video.domain is initiated from main.domain, it will not include the cookies of video.domain. This can be a problem when web servers of video.domain only want to serve content to authenticated users.
And no cookies means no authentication. The Storage Access API solves this issue. It enables third-party iframes to request permission to access first-party cookies. If the user grants the permission, the third-party video.domain will be able to access its first-party cookies. The Storage Access API has been available in WebKit and Safari for over three years. To improve interoperability, this year we have added two new features. First, access is granted on a per-page scope. It means once permission is granted for a third party, it is extended to all its subresources on the same page. You don't have to make a request for each iframe. Second, we allow nested iframes to make requests. This means iframes inside of iframes can also request access to first-party cookies, which was not possible before. To learn more about the new usage, please check our blog post “Updates to the Storage Access API” at webkit.org.

Now we know how to load or import video content from a third party with user permission if needed. How about creating something on your own? With the new Media Recorder API, it’s very easy to do that. Media Recorder API enables you to capture data from media elements, which includes HTML media elements like the video tag or MediaStream objects. You can use it to record from the user’s input devices. You can specify desired options, such as the container's MIME type or desired bit rates of tracks. The API is simple. It is comprised of a single major interface, MediaRecorder, which does all the work of collecting the data from the source and delivering it to you.

Let me show you an example. I used MediaRecorder API to build this web app called "Voice Memo." This is my first voice memo. Click the button, it starts recording from the microphone. Click again, it stops recording and offers playback. This is my first voice memo. That is fun. And now let’s check the implementation. We have two major functions: startRecording and stopRecording.
In startRecording, we get the input media stream for the microphone. Then we create a MediaRecorder object with that. We listen to two events of the media recorder. And then we can start the recorder with the start method. To stop recording, we just need to call the stop method on the mediaRecorder object. Here are the two event handlers. When some captured data is available, we store it in an array. When the recording is stopped, we make a blob with the collected data in the array, and send it to an existing audio element for playback. Just like that, you can create a functional voice recorder.

After you collect the audio data, you may want to edit it. In this case, you can put the new Audio Worklet API to good use. The Audio Worklet interface is part of Web Audio API, which you may already be familiar with if you have done audio processing on the web before. It allows us to run scripts such as JavaScript or WebAssembly code to process audio on the audio-rendering thread, supporting custom AudioNodes. Compared with ScriptProcessorNode, the previous solution for running custom scripts, it reduces the hopping between the rendering thread and the main thread and ensures low latency. With Audio Worklet, I added a new capability to my Voice Memo. This is my distorted voice. If the Distortion box is checked for recording, some distortion effect will be applied to the audio. This is my distorted voice. That sounds cool, and let’s take a look at how it is implemented.

I modified the startRecording function to add audio processing. We still need to get the MediaStream for audio input first. To use Audio Worklet API, there are four basic steps. Step one: create a source. Step two: create an AudioWorkletNode and bind it with an Audio Worklet processor which performs audio processing. The processor is implemented in a module, and we will look at it later. Step three: create a destination. Step four: connect the path from source to destination.
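The four steps above can be sketched roughly as follows. This is an illustration rather than the code from the slides: the function name buildDistortionGraph and the processor name 'distortion-processor' are hypothetical, and the sketch assumes the processor module has already been loaded with audioContext.audioWorklet.addModule() before the function is called. It relies on browser globals (AudioWorkletNode, MediaRecorder), so it is meant to run in a page, not on a server.

```javascript
// Assumes the module that registers 'distortion-processor' (a hypothetical
// name) was loaded earlier via: await audioContext.audioWorklet.addModule(url)
function buildDistortionGraph(audioContext, microphoneStream) {
  // Step one: create a source node from the microphone's MediaStream.
  const source = audioContext.createMediaStreamSource(microphoneStream);

  // Step two: create an AudioWorkletNode bound to the registered processor.
  const distortion = new AudioWorkletNode(audioContext, 'distortion-processor');

  // Step three: create a destination that exposes the result as a MediaStream.
  const destination = audioContext.createMediaStreamDestination();

  // Step four: connect the path from source to destination.
  source.connect(distortion);
  distortion.connect(destination);

  // Record the processed audio rather than the raw microphone input.
  return new MediaRecorder(destination.stream);
}
```

The caller would then attach its dataavailable and stop handlers to the returned recorder and call start(), just as in the plain recording case.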
This time, MediaRecorder takes the output from AudioWorklet as its source, and it records the distorted audio. This is the audio-processing module. We implement the DistortionProcessor class here. It must extend the AudioWorkletProcessor class and must provide the implementation for the process function. The inputs are the audio samples coming into Audio Worklet, and outputs are the resulting samples after processing. You can use different algorithms to create output. Here, I use a custom function called distort() to calculate a value based on input. The process function returns true, meaning the processor node is active. That’s the basic structure of the process function. After creating the processor class, we need to globally register it under a specified name, so it can be used to construct AudioWorkletNode. Just like that, you can apply sound effects to your audio data.

So far we have discussed producing and processing audio data. How about storing it or sharing it somewhere else for your records? I guess you don’t want to lose the recording after you quit the browser. With updates to Web Share API this year, it’s quite easy to do that. Web Share is not new in WebKit and Safari. If you choose to share a link on a web page in Safari, a share sheet will show up with sharing targets like Messages, Mail, or AirDrop. The share sheet, which matches the system style well, is created with Web Share API. This year, we have added support for file sharing. It means you can share image, video, audio, or other types of files with this API now.

Let’s add the sharing capability to Voice Memo. If the Save box is checked, Voice Memo will create an audio file with captured data and display a Share button to let us share the file. Here, I want to share the memo file by email. With just one click, a nice draft is created with the memo file attached. And let’s check the code. This is the stop event handler we saw in the MediaRecorder example.
First, let’s make the blob variable in the stop event handler global, so it can be used by the share function. The share function is the click event handler of the share button. It converts a blob to a file and gives it a file name. The file is put in an array because that’s the expected input type. Then we check if the API is available and if the file can be shared with the canShare method. If the check is passed, we call navigator.share with the file array. There are options you can specify, like title and description text. It’s as easy as that to make your web app share files like a native app does.

Well, if you don’t actually want to interact with audio data, but just want the text of it (for example, in the case of voice commands), there is also a new API for you. That’s Speech Recognition. As its name suggests, Speech Recognition API captures live audio and transcribes it to text. It also gives you probabilities and alternatives of the transcript. It uses the same speech engine as Siri, and it gets all the benefits: multi-language support and great accuracy. That also means your user will need to turn on Siri or Dictation in System Preferences or Settings to make the API available. Recognition can be server based, so we put up a privacy prompt when the recognition service is used for the first time in the app. Users can change the permission in System Preferences or Settings.

Now, let’s update Voice Memo with this new capability. If the Recognition box is checked, a transcript is generated for the recording. This is my Voice Memo transcript. Period. And let’s check the code. The usage is a bit like media recorder. Here we have two major functions: startRecognition and stopRecognition. You need to create a webkitSpeechRecognition object first. Yes, we still keep the WebKit prefix for now for compatibility, so don’t forget to add it. Then you can set some properties of the recognition, like continuous, which asks recognition to keep going until it is stopped.
We listen to result and end events. With the recognition object, we can call the start method to start and the stop method to stop. On the result event, we collect finalTranscript into a string. Here, I only pick the first item of the results because the transcription alternatives are sorted based on probabilities. When recognition stops, I use a custom log function to print the transcript to the screen. Like that, you can add recognition capability to your web content within just a few lines.

It’s been a long journey, and there’s one last web API I think is worth mentioning. You may have noticed that on macOS and iOS, the Now Playing widget can show you media states in Safari. It’s convenient, but it usually does not contain much information. For example, this only shows the title of the web page; no information about what audio is being played. There is a new web API that can help you improve this situation: the Media Session API. Media Session API lets you communicate media states between a web page and other platform components. If you want your user to view or control media states outside of the web page, like in the Now Playing widget, this is the API you need to know. For more details about Media Session API, please check our WWDC session "Coordinate media playback on the web with GroupActivities."

And these are the new features we have just explored. I hope you feel you have learned something from it. And your homework today is to implement your own Voice Memo with these new APIs. I'm just kidding, but we do have a few things we hope you can do to help us bring you the best development experience in WebKit and Safari. Please try out the new features in the latest WebKit and Safari and file bug reports at bugs.webkit.org. You can take a sneak peek at new features or features under active development with Safari Technology Preview. If you are interested in web technologies that are used in WebKit or Safari, or interested in joining the WebKit community, webkit.org is a good source.
If you want to get fresh updates about WebKit, or if you have any questions for us, don’t forget to follow us or tag us on Twitter. Thanks for watching this session, and I hope you have a great time at WWDC! ♪
