Vendor things

John Doty 2024-03-08 11:03:01 -08:00
parent 5deceec006
commit 977e3c17e5
19434 changed files with 10682014 additions and 0 deletions

@@ -0,0 +1,171 @@
# Change Log
All notable changes to this project will be documented in this file.
The format is based on [Keep a Changelog](http://keepachangelog.com/)
and this project adheres to [Semantic Versioning](http://semver.org/).
## [Unreleased]
## [0.8.4] - 2023-04-22
### Added
- Implement `PartialEq` for `Paint` and subtypes. Thanks to [@hecrj](https://github.com/hecrj)
### Changed
- MSRV bumped to 1.57, mainly because of the `png` crate.
### Fixed
- `ClipMask`s larger than 8191x8191 pixels.
Previously, the creation of a large mask via `ClipMask::set_path`
would have created an empty mask.
## [0.8.3] - 2023-02-05
### Fixed
- Performance regression, probably due to LLVM update in Rust.
Thanks to [@mostafa-khaled775](https://github.com/mostafa-khaled775)
- Big-endian targets support. Thanks to [@ids1024](https://github.com/ids1024)
## [0.8.2] - 2022-10-22
### Added
- `Pixmap::from_vec`.
### Fixed
- Increase Conic to Quad conversion precision. This allows us to produce nicer round caps.
Previously, they were not as round as needed.
## [0.8.1] - 2022-08-29
### Fixed
- Conditional compilation of `FasterMinMax` on fallback platforms.
Thanks to [@CryZe](https://github.com/CryZe)
## [0.8.0] - 2022-08-27
### Added
- AArch64 Neon SIMD support. Up to 3x faster on Apple M1.
Thanks to [@CryZe](https://github.com/CryZe)
### Changed
- `FiniteF32`, `NormalizedF32` and `NonZeroPositiveF32` types have been moved
to the `strict-num` crate.
- Rename `NormalizedF32::from_u8` into `NormalizedF32::new_u8`.
- Rename `NormalizedF32::new_bounded` into `NormalizedF32::new_clamped`.
- Use explicit SIMD intrinsic instead of relying on `safe_arch`.
- MSRV bumped to 1.51
## [0.7.0] - 2022-07-03
### Added
- `tiny-skia-path` dependency that can be used independently from `tiny-skia`.
It contains the `tiny-skia` Bezier path implementation, including stroking and dashing.
As well as all the geometry primitives (like `Point` and `Rect`).
### Changed
- When disabling the `std` feature, one now has to enable the `no-std-float` feature instead of `libm`.
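For example, a `no_std` consumer would now declare the dependency like this (illustrative snippet; the feature names match those in this crate's `Cargo.toml`):

```toml
[dependencies.tiny-skia]
version = "0.7"
default-features = false
features = ["no-std-float"]
```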
## [0.6.6] - 2022-06-23
### Fixed
- Panic in `Rect::round` and `Rect::round_out`.
Thanks to [@Wardenfar](https://github.com/Wardenfar)
## [0.6.5] - 2022-06-10
### Fixed
- Minimum `arrayref` version.
## [0.6.4] - 2022-06-04
### Fixed
- Panic during non-aliased hairline stroking at the bottom edge of an image.
## [0.6.3] - 2022-02-01
### Fixed
- SourceOver blend mode must not be optimized to Source when ClipPath is present.
## [0.6.2] - 2021-12-30
### Fixed
- `ClipMask::intersect_path` alpha multiplying.
## [0.6.1] - 2021-08-28
### Added
- Support rendering on pixmaps larger than 8191x8191 pixels.
From now on, `Pixmap` is limited only by the amount of memory available to the caller.
- `Transform::map_points`
- `PathBuilder::push_oval`
## [0.6.0] - 2021-08-21
### Added
- WASM simd128 support. Thanks to [@CryZe](https://github.com/CryZe)
### Changed
- `Transform::post_scale` no longer requires `&mut self`.
- Update `png` crate.
## [0.5.1] - 2021-03-07
### Fixed
- Color memset optimizations should be ignored when clip mask is present.
- `ClipMask::intersect_path` logic.
## [0.5.0] - 2021-03-06
### Added
- `ClipMask::intersect_path`
- no_std support. Thanks to [@CryZe](https://github.com/CryZe)
### Changed
- Reduce `Transform` strictness. It is no longer guaranteed to contain only finite values,
therefore we no longer have to validate each operation.
### Removed
- `Canvas`. Call `Pixmap`/`PixmapMut` drawing methods directly.
## [0.4.2] - 2021-01-23
### Fixed
- Panic during path filling with anti-aliasing because of incorrect edges processing.
## [0.4.1] - 2021-01-19
### Fixed
- Endless loop during stroke dashing.
## [0.4.0] - 2021-01-02
### Changed
- Remove almost all `unsafe`. No performance changes.
## [0.3.0] - 2020-12-20
### Added
- `PixmapRef` and `PixmapMut`, that can be created from `Pixmap` or from raw data.
- `Canvas::set_clip_mask`, `Canvas::get_clip_mask`, `Canvas::take_clip_mask`.
### Changed
- `Canvas` no longer owns a `Pixmap`.
- `Canvas::draw_pixmap` and `Pattern::new` accept `PixmapRef` instead of `&Pixmap` now.
- Improve clipping performance.
- The internal `ClipMask` type become public.
### Fixed
- Panic when path is drawn slightly past the `Pixmap` bounds.
### Removed
- `Canvas::new`
## 0.2.0 - 2020-11-16
### Changed
- Port to Rust.
## 0.1.0 - 2020-07-04
### Added
- Bindings to a stripped down Skia fork.
[Unreleased]: https://github.com/RazrFalcon/tiny-skia/compare/v0.8.4...HEAD
[0.8.4]: https://github.com/RazrFalcon/tiny-skia/compare/v0.8.3...v0.8.4
[0.8.3]: https://github.com/RazrFalcon/tiny-skia/compare/v0.8.2...v0.8.3
[0.8.2]: https://github.com/RazrFalcon/tiny-skia/compare/v0.8.1...v0.8.2
[0.8.1]: https://github.com/RazrFalcon/tiny-skia/compare/v0.8.0...v0.8.1
[0.8.0]: https://github.com/RazrFalcon/tiny-skia/compare/v0.7.0...v0.8.0
[0.7.0]: https://github.com/RazrFalcon/tiny-skia/compare/v0.6.6...v0.7.0
[0.6.6]: https://github.com/RazrFalcon/tiny-skia/compare/v0.6.5...v0.6.6
[0.6.5]: https://github.com/RazrFalcon/tiny-skia/compare/v0.6.4...v0.6.5
[0.6.4]: https://github.com/RazrFalcon/tiny-skia/compare/v0.6.3...v0.6.4
[0.6.3]: https://github.com/RazrFalcon/tiny-skia/compare/v0.6.2...v0.6.3
[0.6.2]: https://github.com/RazrFalcon/tiny-skia/compare/v0.6.1...v0.6.2
[0.6.1]: https://github.com/RazrFalcon/tiny-skia/compare/v0.6.0...v0.6.1
[0.6.0]: https://github.com/RazrFalcon/tiny-skia/compare/v0.5.1...v0.6.0
[0.5.1]: https://github.com/RazrFalcon/tiny-skia/compare/v0.5.0...v0.5.1
[0.5.0]: https://github.com/RazrFalcon/tiny-skia/compare/v0.4.2...v0.5.0
[0.4.2]: https://github.com/RazrFalcon/tiny-skia/compare/v0.4.1...v0.4.2
[0.4.1]: https://github.com/RazrFalcon/tiny-skia/compare/v0.4.0...v0.4.1
[0.4.0]: https://github.com/RazrFalcon/tiny-skia/compare/v0.3.0...v0.4.0
[0.3.0]: https://github.com/RazrFalcon/tiny-skia/compare/v0.2.0...v0.3.0

third-party/vendor/tiny-skia/Cargo.lock generated vendored Normal file
@@ -0,0 +1,115 @@
# This file is automatically @generated by Cargo.
# It is not intended for manual editing.
version = 3
[[package]]
name = "adler"
version = "1.0.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "f26201604c87b1e01bd3d98f8d5d9a8fcbb815e8cedb41ffccbeb4bf593a35fe"
[[package]]
name = "arrayref"
version = "0.3.6"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "a4c527152e37cf757a3f78aae5a06fbeefdb07ccc535c980a3208ee3060dd544"
[[package]]
name = "arrayvec"
version = "0.7.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "8da52d66c7071e2e3fa2a1e5c6d088fec47b593032b254f5e980de8ea54454d6"
[[package]]
name = "bitflags"
version = "1.3.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "bef38d45163c2f1dde094a7dfd33ccf595c92905c8f8f4fdc18d06fb1037718a"
[[package]]
name = "bytemuck"
version = "1.12.3"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "aaa3a8d9a1ca92e282c96a32d6511b695d7d994d1d102ba85d279f9b2756947f"
[[package]]
name = "cfg-if"
version = "1.0.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "baf1de4339761588bc0619e3cbc0120ee582ebb74b53b4efbf79117bd2da40fd"
[[package]]
name = "crc32fast"
version = "1.3.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "b540bd8bc810d3885c6ea91e2018302f68baba2129ab3e88f32389ee9370880d"
dependencies = [
"cfg-if",
]
[[package]]
name = "flate2"
version = "1.0.25"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "a8a2db397cb1c8772f31494cb8917e48cd1e64f0fa7efac59fbd741a0a8ce841"
dependencies = [
"crc32fast",
"miniz_oxide",
]
[[package]]
name = "libm"
version = "0.2.6"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "348108ab3fba42ec82ff6e9564fc4ca0247bdccdc68dd8af9764bbc79c3c8ffb"
[[package]]
name = "miniz_oxide"
version = "0.6.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "b275950c28b37e794e8c55d88aeb5e139d0ce23fdbbeda68f8d7174abdf9e8fa"
dependencies = [
"adler",
]
[[package]]
name = "png"
version = "0.17.7"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "5d708eaf860a19b19ce538740d2b4bdeeb8337fa53f7738455e706623ad5c638"
dependencies = [
"bitflags",
"crc32fast",
"flate2",
"miniz_oxide",
]
[[package]]
name = "strict-num"
version = "0.1.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "9df65f20698aeed245efdde3628a6b559ea1239bbb871af1b6e3b58c413b2bd1"
[[package]]
name = "tiny-skia"
version = "0.8.4"
dependencies = [
"arrayref",
"arrayvec",
"bytemuck",
"cfg-if",
"png",
"tiny-skia-path",
]
[[package]]
name = "tiny-skia-path"
version = "0.8.4"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "adbfb5d3f3dd57a0e11d12f4f13d4ebbbc1b5c15b7ab0a156d030b21da5f677c"
dependencies = [
"arrayref",
"bytemuck",
"libm",
"strict-num",
]

third-party/vendor/tiny-skia/Cargo.toml vendored Normal file
@@ -0,0 +1,63 @@
# THIS FILE IS AUTOMATICALLY GENERATED BY CARGO
#
# When uploading crates to the registry Cargo will automatically
# "normalize" Cargo.toml files for maximal compatibility
# with all versions of Cargo and also rewrite `path` dependencies
# to registry (e.g., crates.io) dependencies.
#
# If you are reading this file be aware that the original Cargo.toml
# will likely look very different (and much more reasonable).
# See Cargo.toml.orig for the original contents.
[package]
edition = "2018"
name = "tiny-skia"
version = "0.8.4"
authors = ["Yevhenii Reizner <razrfalcon@gmail.com>"]
description = "A tiny Skia subset ported to Rust."
documentation = "https://docs.rs/tiny-skia/"
readme = "README.md"
keywords = [
"2d",
"rendering",
"skia",
]
categories = ["rendering"]
license = "BSD-3-Clause"
repository = "https://github.com/RazrFalcon/tiny-skia"
[dependencies.arrayref]
version = "0.3.6"
[dependencies.arrayvec]
version = "0.7"
default-features = false
[dependencies.bytemuck]
version = "1.12"
features = ["aarch64_simd"]
[dependencies.cfg-if]
version = "1"
[dependencies.png]
version = "0.17"
optional = true
[dependencies.tiny-skia-path]
version = "0.8.4"
default-features = false
[features]
default = [
"std",
"simd",
"png-format",
]
no-std-float = ["tiny-skia-path/no-std-float"]
png-format = [
"std",
"png",
]
simd = []
std = ["tiny-skia-path/std"]

third-party/vendor/tiny-skia/LICENSE vendored Normal file
@@ -0,0 +1,30 @@
Copyright (c) 2011 Google Inc. All rights reserved.
Copyright (c) 2020 Yevhenii Reizner All rights reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are
met:
* Redistributions of source code must retain the above copyright
notice, this list of conditions and the following disclaimer.
* Redistributions in binary form must reproduce the above copyright
notice, this list of conditions and the following disclaimer in
the documentation and/or other materials provided with the
distribution.
* Neither the name of the copyright holder nor the names of its
contributors may be used to endorse or promote products derived
from this software without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

third-party/vendor/tiny-skia/README.md vendored Normal file
@@ -0,0 +1,139 @@
# tiny-skia
![Build Status](https://github.com/RazrFalcon/tiny-skia/workflows/Rust/badge.svg)
[![Crates.io](https://img.shields.io/crates/v/tiny-skia.svg)](https://crates.io/crates/tiny-skia)
[![Documentation](https://docs.rs/tiny-skia/badge.svg)](https://docs.rs/tiny-skia)
[![Rust 1.57+](https://img.shields.io/badge/rust-1.57+-orange.svg)](https://www.rust-lang.org)
`tiny-skia` is a tiny [Skia] subset ported to Rust.
The goal is to provide an absolutely minimal, CPU-only, 2D rendering library for the Rust ecosystem,
with a focus on rendering quality, speed and binary size.
And while `tiny-skia` is definitely tiny, it supports all the common 2D operations:
filling and stroking a shape with a solid color, gradient or pattern;
stroke dashing; clipping; image blending; PNG load/save.
The main missing feature is text rendering
(see [#1](https://github.com/RazrFalcon/tiny-skia/issues/1)).
**Note:** this is not a Skia replacement and never will be. It's more of a research project.
## Motivation
The main motivation behind this library is to have a small, high-quality 2D rendering
library that can be used by [resvg]. And the choice is rather limited.
You basically have to choose between [cairo], Qt and Skia. And all of them are
relatively bloated, hard to compile and distribute. Not to mention that none of them
are written in Rust.
But if we ignore those problems and focus only on quality and speed alone,
Skia is by far the best one.
However, the main problem with Skia is that it's huge. Really huge.
It supports CPU and GPU rendering, multiple input and output formats (including SVG and PDF),
various filters, color spaces, color types and text rendering.
It consists of 370 KLOC without dependencies (around 7 MLOC with dependencies)
and requires around 4-8 GiB of disk space to be built from sources.
And the final binary is 3-8 MiB big, depending on enabled features.
Not to mention that it requires `clang` and no other compiler
and uses an obscure build system (`gn`) which was using Python2 until recently.
`tiny-skia` tries to be small, simple and easy to build.
Currently, it has around 14 KLOC, compiles in less than 5s on a modern CPU
and adds around 200KiB to your binary.
## Performance
Currently, `tiny-skia` is 20-100% slower than Skia on x86-64 and about 100-300% slower on ARM,
which is still faster than [cairo] and [raqote] in many cases.
See benchmark results [here](https://razrfalcon.github.io/tiny-skia/x86_64.html).
The heart of Skia's CPU rendering is
[SkRasterPipeline](https://github.com/google/skia/blob/master/src/opts/SkRasterPipeline_opts.h).
And this is an extremely optimized piece of code.
But to be a bit pedantic, it's not really C++ code. It relies on clang's
non-standard vector extensions, which means it works only with clang.
You can actually build it with gcc/msvc, but it will simply ignore all the optimizations
and become 15-30 *times* slower, which makes it all but useless.
Also note that neither Skia nor `tiny-skia` supports dynamic CPU detection,
so by enabling newer instructions you make the resulting binary non-portable.
Essentially, you will get decent performance on x86 targets by default.
But if you are looking for even better performance, you should compile your application
with the `RUSTFLAGS="-Ctarget-cpu=haswell"` environment variable to enable AVX instructions.
We support ARM AArch64 NEON as well, and there is no need to pass any additional flags.
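Concretely, the flag is passed through the environment at build time (standard cargo/rustc usage, nothing tiny-skia-specific):

```shell
# Enable AVX (Haswell-era) instructions; the resulting binary
# will not run on older x86-64 CPUs.
RUSTFLAGS="-Ctarget-cpu=haswell" cargo build --release
```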
You can find more information in [benches/README.md](./benches/README.md).
## Rendering quality
Unless there is a bug, `tiny-skia` must produce exactly the same results as Skia.
## Safety
While a quick search will show tons of `unsafe`, the library is actually fully safe.
All pixel access is bounds-checked, and all memory-related operations are safe.
We must use `unsafe` to call SIMD intrinsics, which is perfectly safe,
but Rust's std still marks them as `unsafe` because they may be missing on the target CPU.
We do check for that.
We also have to mark some types (to cast `[u32; 1]` to `[u8; 4]` and vice versa) as
[bytemuck::Pod](https://docs.rs/bytemuck/1.4.1/bytemuck/trait.Pod.html),
which is an `unsafe` trait, but still perfectly safe.
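The `[u32; 1]` to `[u8; 4]` reinterpretation mentioned above can be illustrated with the standard library alone (a safe sketch of the same native-endian layout; `bytemuck` casts in place, while these std functions copy):

```rust
// Pack four RGBA8 channel bytes into one u32 pixel, and back.
fn pack(bytes: [u8; 4]) -> u32 {
    u32::from_ne_bytes(bytes)
}

fn unpack(pixel: u32) -> [u8; 4] {
    pixel.to_ne_bytes()
}

fn main() {
    let bytes = [0x10u8, 0x20, 0x30, 0x40];
    // The round trip preserves every byte, regardless of host endianness.
    assert_eq!(unpack(pack(bytes)), bytes);
    println!("round-trip ok");
}
```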
## Out of scope
Skia is a huge library and we support only a tiny part of it.
More importantly, there are many features we do not plan to support at all:
- GPU rendering.
- Text rendering (maybe someday).
- PDF generation.
- Non-RGBA8888 images.
- Non-PNG image formats.
- Advanced Bézier path operations.
- Conic path segments.
- Path effects (except dashing).
- Any kind of resource caching.
- ICC profiles.
## Notable changes
Despite being a port, we still have a lot of changes even in the supported subset.
- No global alpha.<br/>
Unlike Skia, only `Pattern` is allowed to have opacity.
In all other cases you should adjust the color's opacity manually.
- No bilinear + mipmap down-scaling support.
- `tiny-skia` uses just a simple alpha mask for clipping, while Skia has a very complicated,
but way faster algorithm.
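On the first point above: since only `Pattern` carries opacity, a caller who wants "global alpha" scales each color's alpha channel before painting. A minimal helper (hypothetical, not part of the tiny-skia API):

```rust
// Hypothetical helper: scale an RGBA8 color's alpha by a global
// opacity factor in 0.0..=1.0 (values outside the range are clamped).
fn with_opacity(rgba: [u8; 4], opacity: f32) -> [u8; 4] {
    let a = (f32::from(rgba[3]) * opacity.clamp(0.0, 1.0)).round() as u8;
    [rgba[0], rgba[1], rgba[2], a]
}

fn main() {
    // Halving the opacity of the color used throughout the examples.
    assert_eq!(with_opacity([50, 127, 150, 200], 0.5), [50, 127, 150, 100]);
    println!("ok");
}
```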
## Notes about the port
`tiny-skia` should be viewed as a Rust 2D rendering library that uses Skia algorithms internally.
We have a completely different public API. The internals are also extremely simplified.
But all the core logic and math is borrowed from Skia. Hence the name.
As for the porting process itself, Skia makes heavy use of goto, inheritance, virtual methods,
linked lists, const generics and template specialization, and most of these features are
unavailable or unidiomatic in Rust.
There is also a lot of pointer magic, implicit mutation and caching.
Therefore we had to compromise or even rewrite some parts from scratch.
## Alternatives
Right now, the only pure Rust alternative is [raqote].
- It doesn't support high-quality antialiasing (hairline stroking in particular).
- It's very slow (see [benchmarks](./benches/README.md)).
- There are some rendering issues (like gradient transparency).
- Raqote has very rudimentary text rendering support, while tiny-skia has none.
## License
The same as used by [Skia]: [New BSD License](./LICENSE)
[Skia]: https://skia.org/
[cairo]: https://www.cairographics.org/
[raqote]: https://github.com/jrmuizel/raqote
[resvg]: https://github.com/RazrFalcon/resvg

@@ -0,0 +1,29 @@
use tiny_skia::*;
fn main() {
let clip_path = {
let mut pb = PathBuilder::new();
pb.push_circle(250.0, 250.0, 200.0);
pb.push_circle(250.0, 250.0, 100.0);
pb.finish().unwrap()
};
let clip_path = clip_path
.transform(Transform::from_row(1.0, -0.3, 0.0, 1.0, 0.0, 75.0))
.unwrap();
let mut clip_mask = ClipMask::new();
clip_mask.set_path(500, 500, &clip_path, FillRule::EvenOdd, true);
let mut paint = Paint::default();
paint.set_color_rgba8(50, 127, 150, 200);
let mut pixmap = Pixmap::new(500, 500).unwrap();
pixmap.fill_rect(
Rect::from_xywh(0.0, 0.0, 500.0, 500.0).unwrap(),
&paint,
Transform::identity(),
Some(&clip_mask),
);
pixmap.save_png("image.png").unwrap();
}

@@ -0,0 +1,47 @@
use tiny_skia::*;
fn main() {
let mut paint1 = Paint::default();
paint1.set_color_rgba8(50, 127, 150, 200);
paint1.anti_alias = true;
let mut paint2 = Paint::default();
paint2.set_color_rgba8(220, 140, 75, 180);
let path1 = {
let mut pb = PathBuilder::new();
pb.move_to(60.0, 60.0);
pb.line_to(160.0, 940.0);
pb.cubic_to(380.0, 840.0, 660.0, 800.0, 940.0, 800.0);
pb.cubic_to(740.0, 460.0, 440.0, 160.0, 60.0, 60.0);
pb.close();
pb.finish().unwrap()
};
let path2 = {
let mut pb = PathBuilder::new();
pb.move_to(940.0, 60.0);
pb.line_to(840.0, 940.0);
pb.cubic_to(620.0, 840.0, 340.0, 800.0, 60.0, 800.0);
pb.cubic_to(260.0, 460.0, 560.0, 160.0, 940.0, 60.0);
pb.close();
pb.finish().unwrap()
};
let mut pixmap = Pixmap::new(1000, 1000).unwrap();
pixmap.fill_path(
&path1,
&paint1,
FillRule::Winding,
Transform::identity(),
None,
);
pixmap.fill_path(
&path2,
&paint2,
FillRule::Winding,
Transform::identity(),
None,
);
pixmap.save_png("image.png").unwrap();
}

@@ -0,0 +1,25 @@
use tiny_skia::*;
// This example demonstrates thin paths rendering.
fn main() {
let mut pb = PathBuilder::new();
pb.move_to(50.0, 100.0);
pb.cubic_to(130.0, 20.0, 390.0, 120.0, 450.0, 30.0);
let path = pb.finish().unwrap();
let mut paint = Paint::default();
paint.set_color_rgba8(50, 127, 150, 200);
paint.anti_alias = true;
let mut pixmap = Pixmap::new(500, 500).unwrap();
let mut transform = Transform::identity();
for i in 0..20 {
let mut stroke = Stroke::default();
stroke.width = 2.0 - (i as f32 / 10.0);
pixmap.stroke_path(&path, &paint, &stroke, transform, None);
transform = transform.pre_translate(0.0, 20.0);
}
pixmap.save_png("image.png").unwrap();
}

@@ -0,0 +1,59 @@
use tiny_skia::*;
fn main() {
let triangle = create_triangle();
let mut pixmap = Pixmap::new(400, 400).unwrap();
let now = std::time::Instant::now();
let mut paint = PixmapPaint::default();
paint.quality = FilterQuality::Bicubic;
pixmap.draw_pixmap(
20,
20,
triangle.as_ref(),
&paint,
Transform::from_row(1.2, 0.5, 0.5, 1.2, 0.0, 0.0),
None,
);
println!(
"Rendered in {:.2}ms",
now.elapsed().as_micros() as f64 / 1000.0
);
pixmap.save_png("image.png").unwrap();
}
fn create_triangle() -> Pixmap {
let mut paint = Paint::default();
paint.set_color_rgba8(50, 127, 150, 200);
paint.anti_alias = true;
let mut pb = PathBuilder::new();
pb.move_to(0.0, 200.0);
pb.line_to(200.0, 200.0);
pb.line_to(100.0, 0.0);
pb.close();
let path = pb.finish().unwrap();
let mut pixmap = Pixmap::new(200, 200).unwrap();
pixmap.fill_path(
&path,
&paint,
FillRule::Winding,
Transform::identity(),
None,
);
let path = PathBuilder::from_rect(Rect::from_ltrb(0.0, 0.0, 200.0, 200.0).unwrap());
let stroke = Stroke::default();
paint.set_color_rgba8(200, 0, 0, 220);
pixmap.stroke_path(&path, &paint, &stroke, Transform::identity(), None); // TODO: stroke_rect
pixmap
}

@@ -0,0 +1,73 @@
use tiny_skia::*;
// This example will create a 20_000x20_000px image, which can take a while in debug mode.
// It is mainly used to test that our tiling algorithm actually works and doesn't panic.
fn main() {
let path1 = {
let mut pb = PathBuilder::new();
pb.move_to(1200.0, 1200.0);
pb.line_to(3200.0, 18800.0);
pb.cubic_to(7600.0, 16800.0, 13200.0, 16000.0, 18800.0, 16000.0);
pb.cubic_to(14800.0, 9200.0, 8800.0, 3200.0, 1200.0, 1200.0);
pb.close();
pb.finish().unwrap()
};
let path2 = {
let mut pb = PathBuilder::new();
pb.move_to(18800.0, 1200.0);
pb.line_to(16800.0, 18800.0);
pb.cubic_to(12400.0, 16800.0, 6800.0, 16000.0, 1200.0, 16000.0);
pb.cubic_to(5200.0, 9200.0, 11200.0, 3200.0, 18800.0, 1200.0);
pb.close();
pb.finish().unwrap()
};
let mut pixmap = Pixmap::new(20000, 20000).unwrap();
let clip_path = {
let mut pb = PathBuilder::new();
pb.push_circle(10000.0, 10000.0, 7000.0);
pb.finish().unwrap()
};
let mut clip = ClipMask::new();
clip.set_path(20000, 20000, &clip_path, FillRule::Winding, true);
let mut paint = Paint::default();
paint.set_color_rgba8(90, 175, 100, 150);
paint.anti_alias = true;
let large_rect = Rect::from_xywh(500.0, 500.0, 19000.0, 19000.0).unwrap();
pixmap
.fill_rect(large_rect, &paint, Transform::identity(), None)
.unwrap();
paint.set_color_rgba8(50, 127, 150, 200);
paint.anti_alias = true;
pixmap.fill_path(
&path1,
&paint,
FillRule::Winding,
Transform::default(),
Some(&clip),
);
paint.set_color_rgba8(220, 140, 75, 180);
paint.anti_alias = false;
pixmap.fill_path(
&path2,
&paint,
FillRule::Winding,
Transform::default(),
None,
);
paint.set_color_rgba8(255, 10, 15, 180);
paint.anti_alias = true;
let mut stroke = Stroke::default();
stroke.width = 0.8; // hairline
pixmap.stroke_path(&path2, &paint, &stroke, Transform::default(), None);
pixmap.save_png("image.png").unwrap();
}

@@ -0,0 +1,34 @@
use tiny_skia::*;
fn main() {
let mut paint = Paint::default();
paint.shader = LinearGradient::new(
Point::from_xy(100.0, 100.0),
Point::from_xy(900.0, 900.0),
vec![
GradientStop::new(0.0, Color::from_rgba8(50, 127, 150, 200)),
GradientStop::new(1.0, Color::from_rgba8(220, 140, 75, 180)),
],
SpreadMode::Pad,
Transform::identity(),
)
.unwrap();
let mut pb = PathBuilder::new();
pb.move_to(60.0, 60.0);
pb.line_to(160.0, 940.0);
pb.cubic_to(380.0, 840.0, 660.0, 800.0, 940.0, 800.0);
pb.cubic_to(740.0, 460.0, 440.0, 160.0, 60.0, 60.0);
pb.close();
let path = pb.finish().unwrap();
let mut pixmap = Pixmap::new(1000, 1000).unwrap();
pixmap.fill_path(
&path,
&paint,
FillRule::Winding,
Transform::identity(),
None,
);
pixmap.save_png("image.png").unwrap();
}

@@ -0,0 +1,50 @@
use tiny_skia::*;
fn main() {
let triangle = crate_triangle();
let mut paint = Paint::default();
paint.anti_alias = true;
paint.shader = Pattern::new(
triangle.as_ref(),
SpreadMode::Repeat,
FilterQuality::Bicubic,
1.0,
Transform::from_row(1.5, -0.4, 0.0, -0.8, 5.0, 1.0),
);
let path = PathBuilder::from_circle(200.0, 200.0, 180.0).unwrap();
let mut pixmap = Pixmap::new(400, 400).unwrap();
pixmap.fill_path(
&path,
&paint,
FillRule::Winding,
Transform::identity(),
None,
);
pixmap.save_png("image.png").unwrap();
}
fn crate_triangle() -> Pixmap {
let mut paint = Paint::default();
paint.set_color_rgba8(50, 127, 150, 200);
paint.anti_alias = true;
let mut pb = PathBuilder::new();
pb.move_to(0.0, 20.0);
pb.line_to(20.0, 20.0);
pb.line_to(10.0, 0.0);
pb.close();
let path = pb.finish().unwrap();
let mut pixmap = Pixmap::new(20, 20).unwrap();
pixmap.fill_path(
&path,
&paint,
FillRule::Winding,
Transform::identity(),
None,
);
pixmap
}

@@ -0,0 +1,30 @@
use tiny_skia::*;
// Based on https://fiddle.skia.org/c/@compose_path
fn main() {
let mut paint = Paint::default();
paint.set_color_rgba8(0, 127, 0, 200);
paint.anti_alias = true;
let path = {
let mut pb = PathBuilder::new();
const RADIUS: f32 = 250.0;
const CENTER: f32 = 250.0;
pb.move_to(CENTER + RADIUS, CENTER);
for i in 1..8 {
let a = 2.6927937 * i as f32;
pb.line_to(CENTER + RADIUS * a.cos(), CENTER + RADIUS * a.sin());
}
pb.finish().unwrap()
};
let mut stroke = Stroke::default();
stroke.width = 6.0;
stroke.line_cap = LineCap::Round;
stroke.dash = StrokeDash::new(vec![20.0, 40.0], 0.0);
let mut pixmap = Pixmap::new(500, 500).unwrap();
pixmap.stroke_path(&path, &paint, &stroke, Transform::identity(), None);
pixmap.save_png("image.png").unwrap();
}

@@ -0,0 +1,233 @@
// Copyright 2006 The Android Open Source Project
// Copyright 2020 Yevhenii Reizner
//
// Use of this source code is governed by a BSD-style license that can be
// found in the LICENSE file.
use alloc::vec;
use alloc::vec::Vec;
use core::convert::TryFrom;
use core::num::NonZeroU16;
use crate::color::AlphaU8;
use crate::LengthU32;
pub type AlphaRun = Option<NonZeroU16>;
/// Sparse array of run-length-encoded alpha (supersampling coverage) values.
///
/// Sparseness allows us to independently compose several paths into the
/// same AlphaRuns buffer.
pub struct AlphaRuns {
pub runs: Vec<AlphaRun>,
pub alpha: Vec<u8>,
}
impl AlphaRuns {
pub fn new(width: LengthU32) -> Self {
let mut runs = AlphaRuns {
runs: vec![None; (width.get() + 1) as usize],
alpha: vec![0; (width.get() + 1) as usize],
};
runs.reset(width);
runs
}
/// Returns 0-255 given 0-256.
pub fn catch_overflow(alpha: u16) -> AlphaU8 {
debug_assert!(alpha <= 256);
(alpha - (alpha >> 8)) as u8
}
/// Returns true if the scanline contains only a single run, of alpha value 0.
pub fn is_empty(&self) -> bool {
debug_assert!(self.runs[0].is_some());
match self.runs[0] {
Some(run) => self.alpha[0] == 0 && self.runs[usize::from(run.get())].is_none(),
None => true,
}
}
/// Reinitialize for a new scanline.
pub fn reset(&mut self, width: LengthU32) {
let run = u16::try_from(width.get()).unwrap();
self.runs[0] = NonZeroU16::new(run);
self.runs[width.get() as usize] = None;
self.alpha[0] = 0;
}
/// Insert into the buffer a run starting at (x-offset_x).
///
/// if start_alpha > 0
/// one pixel with value += start_alpha,
/// max 255
/// if middle_count > 0
/// middle_count pixels with value += max_value
/// if stop_alpha > 0
/// one pixel with value += stop_alpha
///
/// Returns the offset_x value that should be passed on the next call,
/// assuming we're on the same scanline. If the caller is switching
/// scanlines, then offset_x should be 0 when this is called.
pub fn add(
&mut self,
x: u32,
start_alpha: AlphaU8,
mut middle_count: usize,
stop_alpha: AlphaU8,
max_value: u8,
offset_x: usize,
) -> usize {
let mut x = x as usize;
let mut runs_offset = offset_x;
let mut alpha_offset = offset_x;
let mut last_alpha_offset = offset_x;
x -= offset_x;
if start_alpha != 0 {
Self::break_run(
&mut self.runs[runs_offset..],
&mut self.alpha[alpha_offset..],
x,
1,
);
// I should be able to just add alpha[x] + start_alpha.
// However, if the trailing edge of the previous span and the leading
// edge of the current span round to the same super-sampled x value,
// I might overflow to 256 with this add, hence the funny subtract (crud).
let tmp = u16::from(self.alpha[alpha_offset + x]) + u16::from(start_alpha);
debug_assert!(tmp <= 256);
// was (tmp >> 7), but that seems wrong if we're trying to catch 256
self.alpha[alpha_offset + x] = (tmp - (tmp >> 8)) as u8;
runs_offset += x + 1;
alpha_offset += x + 1;
x = 0;
}
if middle_count != 0 {
Self::break_run(
&mut self.runs[runs_offset..],
&mut self.alpha[alpha_offset..],
x,
middle_count,
);
alpha_offset += x;
runs_offset += x;
x = 0;
loop {
let a = Self::catch_overflow(
u16::from(self.alpha[alpha_offset]) + u16::from(max_value),
);
self.alpha[alpha_offset] = a;
let n = usize::from(self.runs[runs_offset].unwrap().get());
debug_assert!(n <= middle_count);
alpha_offset += n;
runs_offset += n;
middle_count -= n;
if middle_count == 0 {
break;
}
}
last_alpha_offset = alpha_offset;
}
if stop_alpha != 0 {
Self::break_run(
&mut self.runs[runs_offset..],
&mut self.alpha[alpha_offset..],
x,
1,
);
alpha_offset += x;
self.alpha[alpha_offset] = (self.alpha[alpha_offset] + stop_alpha) as u8;
last_alpha_offset = alpha_offset;
}
// new offset_x
last_alpha_offset
}
/// Break the runs in the buffer at offsets x and x+count, properly
/// updating the runs to the right and left.
///
/// i.e. from the state AAAABBBB, run-length encoded as A4B4,
/// break_run(..., 2, 5) would produce AAAABBBB rle as A2A2B3B1.
/// Allows add() to sum another run to some of the new sub-runs.
/// i.e. adding ..CCCCC. would produce AADDEEEB, rle as A2D2E3B1.
fn break_run(runs: &mut [AlphaRun], alpha: &mut [u8], mut x: usize, count: usize) {
debug_assert!(count > 0);
let orig_x = x;
let mut runs_offset = 0;
let mut alpha_offset = 0;
while x > 0 {
let n = usize::from(runs[runs_offset].unwrap().get());
debug_assert!(n > 0);
if x < n {
alpha[alpha_offset + x] = alpha[alpha_offset];
runs[runs_offset + 0] = NonZeroU16::new(x as u16);
runs[runs_offset + x] = NonZeroU16::new((n - x) as u16);
break;
}
runs_offset += n;
alpha_offset += n;
x -= n;
}
runs_offset = orig_x;
alpha_offset = orig_x;
x = count;
loop {
let n = usize::from(runs[runs_offset].unwrap().get());
debug_assert!(n > 0);
if x < n {
alpha[alpha_offset + x] = alpha[alpha_offset];
runs[runs_offset + 0] = NonZeroU16::new(x as u16);
runs[runs_offset + x] = NonZeroU16::new((n - x) as u16);
break;
}
x -= n;
if x == 0 {
break;
}
runs_offset += n;
alpha_offset += n;
}
}
/// Cut (at offset x in the buffer) a run into two shorter runs with
/// matching alpha values.
///
/// Used by the RectClipBlitter to trim a RLE encoding to match the
/// clipping rectangle.
pub fn break_at(alpha: &mut [AlphaU8], runs: &mut [AlphaRun], mut x: i32) {
let mut alpha_i = 0;
let mut run_i = 0;
while x > 0 {
let n = runs[run_i].unwrap().get();
let n_usize = usize::from(n);
let n_i32 = i32::from(n);
if x < n_i32 {
alpha[alpha_i + x as usize] = alpha[alpha_i];
runs[0] = NonZeroU16::new(x as u16);
runs[x as usize] = NonZeroU16::new((n_i32 - x) as u16);
break;
}
run_i += n_usize;
alpha_i += n_usize;
x -= n_i32;
}
}
}
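The run-breaking logic above can be modeled with a much simpler representation. This is a hedged sketch: it uses a `Vec<(len, value)>` RLE instead of the crate's sparse in-place `runs[]`/`alpha[]` arrays, but it reproduces the `break_run(..., 2, 5)` example from the doc comment:

```rust
// Conceptual model of break_run: split an RLE sequence at offsets
// x and x + count so a later add() can write into the middle runs.
// (Simplified: a Vec of (length, value) pairs instead of the crate's
// sparse in-place runs[]/alpha[] arrays.)
fn break_at(runs: &mut Vec<(usize, char)>, offset: usize) {
    let mut pos = 0;
    for i in 0..runs.len() {
        let (len, v) = runs[i];
        if offset > pos && offset < pos + len {
            // Split run i into two shorter runs with the same value.
            runs[i].0 = offset - pos;
            runs.insert(i + 1, (pos + len - offset, v));
            return;
        }
        pos += len;
    }
}

fn main() {
    // AAAABBBB encoded as A4B4; breaking at 2 and at 2 + 5 yields
    // A2A2B3B1, matching the break_run(..., 2, 5) doc example.
    let mut runs = vec![(4, 'A'), (4, 'B')];
    break_at(&mut runs, 2);
    break_at(&mut runs, 7);
    assert_eq!(runs, vec![(2, 'A'), (2, 'A'), (3, 'B'), (1, 'B')]);
}
```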


@ -0,0 +1,132 @@
use crate::pipeline;
/// A blending mode.
#[derive(Copy, Clone, Eq, PartialEq, Ord, PartialOrd, Debug)]
pub enum BlendMode {
/// Replaces destination with zero: fully transparent.
Clear,
/// Replaces destination.
Source,
/// Preserves destination.
Destination,
/// Source over destination.
SourceOver,
/// Destination over source.
DestinationOver,
/// Source trimmed inside destination.
SourceIn,
/// Destination trimmed by source.
DestinationIn,
/// Source trimmed outside destination.
SourceOut,
/// Destination trimmed outside source.
DestinationOut,
/// Source inside destination blended with destination.
SourceAtop,
/// Destination inside source blended with source.
DestinationAtop,
/// Each of source and destination trimmed outside the other.
Xor,
/// Sum of colors.
Plus,
/// Product of premultiplied colors; darkens destination.
Modulate,
/// Multiply inverse of pixels, inverting result; brightens destination.
Screen,
/// Multiply or screen, depending on destination.
Overlay,
/// Darker of source and destination.
Darken,
/// Lighter of source and destination.
Lighten,
/// Brighten destination to reflect source.
ColorDodge,
/// Darken destination to reflect source.
ColorBurn,
/// Multiply or screen, depending on source.
HardLight,
/// Lighten or darken, depending on source.
SoftLight,
/// Subtract darker from lighter with higher contrast.
Difference,
/// Subtract darker from lighter with lower contrast.
Exclusion,
/// Multiply source with destination, darkening image.
Multiply,
/// Hue of source with saturation and luminosity of destination.
Hue,
/// Saturation of source with hue and luminosity of destination.
Saturation,
/// Hue and saturation of source with luminosity of destination.
Color,
/// Luminosity of source with hue and saturation of destination.
Luminosity,
}
impl Default for BlendMode {
fn default() -> Self {
BlendMode::SourceOver
}
}
impl BlendMode {
pub(crate) fn should_pre_scale_coverage(self) -> bool {
// The most important things we do here are:
// 1) never pre-scale with rgb coverage if the blend mode involves a source-alpha term;
// 2) always pre-scale Plus.
//
// When we pre-scale with rgb coverage, we scale each of source r,g,b, with a distinct value,
// and source alpha with one of those three values. This process destructively updates the
// source-alpha term, so we can't evaluate blend modes that need its original value.
//
// Plus always requires pre-scaling as a specific quirk of its implementation in
// RasterPipeline. This lets us put the clamp inside the blend mode itself rather
// than as a separate stage that'd come after the lerp.
//
// This function is a finer-grained breakdown of SkBlendMode_SupportsCoverageAsAlpha().
matches!(
self,
BlendMode::Destination | // d --> no sa term, ok!
BlendMode::DestinationOver | // d + s*inv(da) --> no sa term, ok!
BlendMode::Plus | // clamp(s+d) --> no sa term, ok!
BlendMode::DestinationOut | // d * inv(sa)
BlendMode::SourceAtop | // s*da + d*inv(sa)
BlendMode::SourceOver | // s + d*inv(sa)
BlendMode::Xor // s*inv(da) + d*inv(sa)
)
}
pub(crate) fn to_stage(self) -> Option<pipeline::Stage> {
match self {
BlendMode::Clear => Some(pipeline::Stage::Clear),
BlendMode::Source => None, // This stage is a no-op.
BlendMode::Destination => Some(pipeline::Stage::MoveDestinationToSource),
BlendMode::SourceOver => Some(pipeline::Stage::SourceOver),
BlendMode::DestinationOver => Some(pipeline::Stage::DestinationOver),
BlendMode::SourceIn => Some(pipeline::Stage::SourceIn),
BlendMode::DestinationIn => Some(pipeline::Stage::DestinationIn),
BlendMode::SourceOut => Some(pipeline::Stage::SourceOut),
BlendMode::DestinationOut => Some(pipeline::Stage::DestinationOut),
BlendMode::SourceAtop => Some(pipeline::Stage::SourceAtop),
BlendMode::DestinationAtop => Some(pipeline::Stage::DestinationAtop),
BlendMode::Xor => Some(pipeline::Stage::Xor),
BlendMode::Plus => Some(pipeline::Stage::Plus),
BlendMode::Modulate => Some(pipeline::Stage::Modulate),
BlendMode::Screen => Some(pipeline::Stage::Screen),
BlendMode::Overlay => Some(pipeline::Stage::Overlay),
BlendMode::Darken => Some(pipeline::Stage::Darken),
BlendMode::Lighten => Some(pipeline::Stage::Lighten),
BlendMode::ColorDodge => Some(pipeline::Stage::ColorDodge),
BlendMode::ColorBurn => Some(pipeline::Stage::ColorBurn),
BlendMode::HardLight => Some(pipeline::Stage::HardLight),
BlendMode::SoftLight => Some(pipeline::Stage::SoftLight),
BlendMode::Difference => Some(pipeline::Stage::Difference),
BlendMode::Exclusion => Some(pipeline::Stage::Exclusion),
BlendMode::Multiply => Some(pipeline::Stage::Multiply),
BlendMode::Hue => Some(pipeline::Stage::Hue),
BlendMode::Saturation => Some(pipeline::Stage::Saturation),
BlendMode::Color => Some(pipeline::Stage::Color),
BlendMode::Luminosity => Some(pipeline::Stage::Luminosity),
}
}
}
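The per-mode formulas listed in the comment above operate on premultiplied colors. As a minimal sketch (plain `[f32; 4]` arrays rather than the crate's pipeline types, channel order illustrative), SourceOver's `s + d*inv(sa)` looks like:

```rust
// Premultiplied source-over: out = s + d * (1 - sa), applied per channel.
// Channel layout here is an assumption for illustration: [r, g, b, a].
fn source_over(s: [f32; 4], d: [f32; 4]) -> [f32; 4] {
    let inv_sa = 1.0 - s[3];
    [
        s[0] + d[0] * inv_sa,
        s[1] + d[1] * inv_sa,
        s[2] + d[2] * inv_sa,
        s[3] + d[3] * inv_sa,
    ]
}

fn main() {
    // 50%-alpha red over opaque white (values already premultiplied).
    let out = source_over([0.5, 0.0, 0.0, 0.5], [1.0, 1.0, 1.0, 1.0]);
    assert_eq!(out, [1.0, 0.5, 0.5, 1.0]);
}
```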


@ -0,0 +1,78 @@
// Copyright 2006 The Android Open Source Project
// Copyright 2020 Yevhenii Reizner
//
// Use of this source code is governed by a BSD-style license that can be
// found in the LICENSE file.
use tiny_skia_path::ScreenIntRect;
use crate::alpha_runs::AlphaRun;
use crate::color::AlphaU8;
use crate::LengthU32;
/// Mask is used to describe alpha bitmaps.
pub struct Mask<'a> {
pub image: &'a [u8],
pub bounds: ScreenIntRect,
pub row_bytes: u32,
}
/// Blitter is responsible for actually writing pixels into memory.
///
/// Besides efficiency, they handle clipping and antialiasing.
/// An object that implements Blitter contains all the context needed to generate pixels
/// for the destination and how src/generated pixels map to the destination.
/// The coordinates passed to the `blit_*` calls are in destination pixel space.
pub trait Blitter {
/// Blits a horizontal run of one or more pixels.
fn blit_h(&mut self, _x: u32, _y: u32, _width: LengthU32) {
unreachable!()
}
/// Blits a horizontal run of antialiased pixels.
///
/// runs[] is a *sparse* zero-terminated run-length encoding of spans of constant alpha values.
///
/// The runs[] and antialias[] work together to represent long runs of pixels with the same
/// alphas. The runs[] contains the number of pixels with the same alpha, and antialias[]
/// contain the coverage value for that number of pixels. The runs[] (and antialias[]) are
/// encoded in a clever way. The runs array is zero-terminated, and has enough entries for
/// each pixel plus one; in most cases some of the entries will not contain valid data. An entry
/// in the runs array contains the number of pixels (np) that have the same alpha value. The
/// next np value is found np entries away. For example, if runs[0] = 7, then the next valid
/// entry will be at runs[7]. The runs array and antialias[] are coupled by index. So, if the
/// np entry is at runs[45] = 12 then the alpha value can be found at antialias[45] = 0x88.
/// This would mean to use an alpha value of 0x88 for the next 12 pixels starting at pixel 45.
fn blit_anti_h(
&mut self,
_x: u32,
_y: u32,
_antialias: &mut [AlphaU8],
_runs: &mut [AlphaRun],
) {
unreachable!()
}
/// Blits a vertical run of pixels with a constant alpha value.
fn blit_v(&mut self, _x: u32, _y: u32, _height: LengthU32, _alpha: AlphaU8) {
unreachable!()
}
fn blit_anti_h2(&mut self, _x: u32, _y: u32, _alpha0: AlphaU8, _alpha1: AlphaU8) {
unreachable!()
}
fn blit_anti_v2(&mut self, _x: u32, _y: u32, _alpha0: AlphaU8, _alpha1: AlphaU8) {
unreachable!()
}
/// Blits a solid rectangle one or more pixels wide.
fn blit_rect(&mut self, _rect: &ScreenIntRect) {
unreachable!()
}
/// Blits a pattern of pixels defined by a rectangle-clipped mask.
fn blit_mask(&mut self, _mask: &Mask, _clip: &ScreenIntRect) {
unreachable!()
}
}
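The sparse `runs[]`/`antialias[]` layout described in the `blit_anti_h` docs can be decoded with a short loop. A minimal sketch, assuming plain `u16` runs and a zero terminator rather than the crate's `AlphaRun` option type:

```rust
// Decode the sparse, zero-terminated runs[]/antialias[] encoding into a
// flat per-pixel coverage buffer. After consuming runs[i] = n, the next
// valid entry sits n slots away, at runs[i + n].
fn decode(runs: &[u16], alpha: &[u8]) -> Vec<u8> {
    let mut out = Vec::new();
    let mut i = 0;
    while runs[i] != 0 {
        let n = runs[i] as usize;
        out.extend(std::iter::repeat(alpha[i]).take(n));
        i += n;
    }
    out
}

fn main() {
    // runs[0] = 3 -> next valid entry is runs[3]; the trailing 0 terminates.
    let runs = [3u16, 0, 0, 2, 0, 0];
    let alpha = [0x40u8, 0, 0, 0x88, 0, 0];
    assert_eq!(decode(&runs, &alpha), vec![0x40, 0x40, 0x40, 0x88, 0x88]);
}
```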

third-party/vendor/tiny-skia/src/clip.rs vendored Normal file

@ -0,0 +1,265 @@
// Copyright 2020 Yevhenii Reizner
//
// Use of this source code is governed by a BSD-style license that can be
// found in the LICENSE file.
use alloc::vec::Vec;
use tiny_skia_path::{IntRect, IntSize, ScreenIntRect, Transform};
use crate::{FillRule, LengthU32, Path};
use crate::{ALPHA_U8_OPAQUE, ALPHA_U8_TRANSPARENT};
use crate::alpha_runs::AlphaRun;
use crate::blitter::Blitter;
use crate::color::AlphaU8;
use crate::math::LENGTH_U32_ONE;
use crate::painter::DrawTiler;
use core::num::NonZeroU32;
/// A clipping mask.
///
/// Unlike Skia, we're using just a simple 8-bit alpha mask.
/// It's way slower, but easier to implement.
#[derive(Clone, Debug)]
pub struct ClipMask {
data: Vec<u8>,
width: LengthU32,
height: LengthU32,
}
impl Default for ClipMask {
fn default() -> Self {
ClipMask {
data: Vec::new(),
width: LENGTH_U32_ONE,
height: LENGTH_U32_ONE,
}
}
}
impl ClipMask {
/// Creates a new, empty mask.
pub fn new() -> Self {
ClipMask::default()
}
/// Checks that mask is empty.
pub fn is_empty(&self) -> bool {
self.data.is_empty()
}
/// Returns mask size.
pub(crate) fn size(&self) -> IntSize {
IntSize::from_wh(self.width.get(), self.height.get()).unwrap()
}
pub(crate) fn as_submask<'a>(&'a self) -> SubClipMaskRef<'a> {
SubClipMaskRef {
size: self.size(),
real_width: self.width,
data: &self.data,
}
}
pub(crate) fn submask<'a>(&'a self, rect: IntRect) -> Option<SubClipMaskRef<'a>> {
let rect = self.size().to_int_rect(0, 0).intersect(&rect)?;
let row_bytes = self.width.get() as usize;
let offset = rect.top() as usize * row_bytes + rect.left() as usize;
Some(SubClipMaskRef {
size: rect.size(),
real_width: self.width,
data: &self.data[offset..],
})
}
pub(crate) fn as_submask_mut<'a>(&'a mut self) -> SubClipMaskMut<'a> {
SubClipMaskMut {
size: self.size(),
real_width: self.width,
data: &mut self.data,
}
}
pub(crate) fn submask_mut<'a>(&'a mut self, rect: IntRect) -> Option<SubClipMaskMut<'a>> {
let rect = self.size().to_int_rect(0, 0).intersect(&rect)?;
let row_bytes = self.width.get() as usize;
let offset = rect.top() as usize * row_bytes + rect.left() as usize;
Some(SubClipMaskMut {
size: rect.size(),
real_width: self.width,
data: &mut self.data[offset..],
})
}
/// Sets the current clipping path.
///
/// Not additive. Overwrites the previous data.
///
/// Path must be transformed beforehand.
pub fn set_path(
&mut self,
width: u32,
height: u32,
path: &Path,
fill_rule: FillRule,
anti_alias: bool,
) -> Option<()> {
let width = NonZeroU32::new(width)?;
let height = NonZeroU32::new(height)?;
self.width = width;
self.height = height;
// Reuse the existing allocation.
self.data.clear();
self.data.resize((width.get() * height.get()) as usize, 0);
if let Some(tiler) = DrawTiler::new(width.get(), height.get()) {
let mut path = path.clone(); // TODO: avoid cloning
for tile in tiler {
let ts = Transform::from_translate(-(tile.x() as f32), -(tile.y() as f32));
path = path.transform(ts)?;
let submax = self.submask_mut(tile.to_int_rect())?;
// We're ignoring "errors" here, because `fill_path` will return `None`
// when rendering a tile that doesn't have a path on it,
// which is not an error in this case.
let clip_rect = tile.size().to_screen_int_rect(0, 0);
if anti_alias {
let mut builder = ClipBuilderAA(submax);
let _ =
crate::scan::path_aa::fill_path(&path, fill_rule, &clip_rect, &mut builder);
} else {
let mut builder = ClipBuilder(submax);
let _ =
crate::scan::path::fill_path(&path, fill_rule, &clip_rect, &mut builder);
}
let ts = Transform::from_translate(tile.x() as f32, tile.y() as f32);
path = path.transform(ts)?;
}
Some(())
} else {
let clip = ScreenIntRect::from_xywh_safe(0, 0, width, height);
if anti_alias {
let mut builder = ClipBuilderAA(self.as_submask_mut());
crate::scan::path_aa::fill_path(path, fill_rule, &clip, &mut builder)
} else {
let mut builder = ClipBuilder(self.as_submask_mut());
crate::scan::path::fill_path(path, fill_rule, &clip, &mut builder)
}
}
}
/// Intersects the provided path with the current clipping path.
///
/// Path must be transformed beforehand.
pub fn intersect_path(
&mut self,
path: &Path,
fill_rule: FillRule,
anti_alias: bool,
) -> Option<()> {
let mut submask = ClipMask::new();
submask.set_path(
self.width.get(),
self.height.get(),
path,
fill_rule,
anti_alias,
)?;
for (a, b) in self.data.iter_mut().zip(submask.data.iter()) {
*a = crate::color::premultiply_u8(*a, *b);
}
Some(())
}
/// Clears the mask.
///
/// Internal memory buffer is not deallocated.
pub fn clear(&mut self) {
// Clear the mask, but keep the allocation.
self.data.clear();
}
}
#[derive(Clone, Copy)]
pub struct SubClipMaskRef<'a> {
pub data: &'a [u8],
pub size: IntSize,
pub real_width: LengthU32,
}
impl<'a> SubClipMaskRef<'a> {
pub(crate) fn clip_mask_ctx(&self) -> crate::pipeline::ClipMaskCtx<'a> {
crate::pipeline::ClipMaskCtx {
data: &self.data,
stride: self.real_width,
}
}
}
// Similar to SubPixmapMut.
pub struct SubClipMaskMut<'a> {
pub data: &'a mut [u8],
pub size: IntSize,
pub real_width: LengthU32,
}
struct ClipBuilder<'a>(SubClipMaskMut<'a>);
impl Blitter for ClipBuilder<'_> {
fn blit_h(&mut self, x: u32, y: u32, width: LengthU32) {
let offset = (y * self.0.real_width.get() + x) as usize;
for i in 0..width.get() as usize {
self.0.data[offset + i] = 255;
}
}
}
struct ClipBuilderAA<'a>(SubClipMaskMut<'a>);
impl Blitter for ClipBuilderAA<'_> {
fn blit_h(&mut self, x: u32, y: u32, width: LengthU32) {
let offset = (y * self.0.real_width.get() + x) as usize;
for i in 0..width.get() as usize {
self.0.data[offset + i] = 255;
}
}
fn blit_anti_h(&mut self, mut x: u32, y: u32, aa: &mut [AlphaU8], runs: &mut [AlphaRun]) {
let mut aa_offset = 0;
let mut run_offset = 0;
let mut run_opt = runs[0];
while let Some(run) = run_opt {
let width = LengthU32::from(run);
match aa[aa_offset] {
ALPHA_U8_TRANSPARENT => {}
ALPHA_U8_OPAQUE => {
self.blit_h(x, y, width);
}
alpha => {
let offset = (y * self.0.real_width.get() + x) as usize;
for i in 0..width.get() as usize {
self.0.data[offset + i] = alpha;
}
}
}
x += width.get();
run_offset += usize::from(run.get());
aa_offset += usize::from(run.get());
run_opt = runs[run_offset];
}
}
}
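`intersect_path` above combines two coverage masks by multiplying them per pixel with `premultiply_u8`, i.e. rounded `a*b/255`. A standalone sketch of that combining step (the helper body is copied from this crate's `color` module; the `intersect` wrapper is illustrative):

```rust
// Rounded (c * a) / 255, as used to intersect two 8-bit coverage masks.
fn premultiply_u8(c: u8, a: u8) -> u8 {
    let prod = u32::from(c) * u32::from(a) + 128;
    ((prod + (prod >> 8)) >> 8) as u8
}

// Per-pixel intersection: full coverage (255) is the identity,
// zero coverage clears, partial coverage scales proportionally.
fn intersect(a: &mut [u8], b: &[u8]) {
    for (x, y) in a.iter_mut().zip(b) {
        *x = premultiply_u8(*x, *y);
    }
}

fn main() {
    let mut mask = [255u8, 128, 128, 0];
    intersect(&mut mask, &[255, 255, 128, 255]);
    assert_eq!(mask, [255, 128, 64, 0]);
}
```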


@ -0,0 +1,490 @@
// Copyright 2006 The Android Open Source Project
// Copyright 2020 Yevhenii Reizner
//
// Use of this source code is governed by a BSD-style license that can be
// found in the LICENSE file.
use tiny_skia_path::{NormalizedF32, Scalar};
/// 8-bit type for an alpha value. 255 is 100% opaque, zero is 100% transparent.
pub type AlphaU8 = u8;
/// Represents fully transparent AlphaU8 value.
pub const ALPHA_U8_TRANSPARENT: AlphaU8 = 0x00;
/// Represents fully opaque AlphaU8 value.
pub const ALPHA_U8_OPAQUE: AlphaU8 = 0xFF;
/// Represents fully transparent Alpha value.
pub const ALPHA_TRANSPARENT: NormalizedF32 = NormalizedF32::ZERO;
/// Represents fully opaque Alpha value.
pub const ALPHA_OPAQUE: NormalizedF32 = NormalizedF32::ONE;
/// A 32-bit RGBA color value.
///
/// Byteorder: ABGR
#[repr(transparent)]
#[derive(Copy, Clone, PartialEq)]
pub struct ColorU8(u32);
impl ColorU8 {
/// Creates a new color.
pub const fn from_rgba(r: u8, g: u8, b: u8, a: u8) -> Self {
ColorU8(pack_rgba(r, g, b, a))
}
/// Returns color's red component.
pub const fn red(self) -> u8 {
self.0.to_ne_bytes()[0]
}
/// Returns color's green component.
pub const fn green(self) -> u8 {
self.0.to_ne_bytes()[1]
}
/// Returns color's blue component.
pub const fn blue(self) -> u8 {
self.0.to_ne_bytes()[2]
}
/// Returns color's alpha component.
pub const fn alpha(self) -> u8 {
self.0.to_ne_bytes()[3]
}
/// Check that color is opaque.
///
/// Alpha == 255
pub fn is_opaque(&self) -> bool {
self.alpha() == ALPHA_U8_OPAQUE
}
/// Returns the value as a primitive type.
pub const fn get(self) -> u32 {
self.0
}
/// Converts into a premultiplied color.
pub fn premultiply(&self) -> PremultipliedColorU8 {
let a = self.alpha();
if a != ALPHA_U8_OPAQUE {
PremultipliedColorU8::from_rgba_unchecked(
premultiply_u8(self.red(), a),
premultiply_u8(self.green(), a),
premultiply_u8(self.blue(), a),
a,
)
} else {
PremultipliedColorU8::from_rgba_unchecked(self.red(), self.green(), self.blue(), a)
}
}
}
impl core::fmt::Debug for ColorU8 {
fn fmt(&self, f: &mut core::fmt::Formatter<'_>) -> core::fmt::Result {
f.debug_struct("ColorU8")
.field("r", &self.red())
.field("g", &self.green())
.field("b", &self.blue())
.field("a", &self.alpha())
.finish()
}
}
/// A 32-bit premultiplied RGBA color value.
///
/// Byteorder: ABGR
#[repr(transparent)]
#[derive(Copy, Clone, PartialEq)]
pub struct PremultipliedColorU8(u32);
// Perfectly safe, since u32 is already Pod.
unsafe impl bytemuck::Zeroable for PremultipliedColorU8 {}
unsafe impl bytemuck::Pod for PremultipliedColorU8 {}
impl PremultipliedColorU8 {
/// A transparent color.
pub const TRANSPARENT: Self = PremultipliedColorU8::from_rgba_unchecked(0, 0, 0, 0);
/// Creates a new premultiplied color.
///
/// RGB components must be <= alpha.
pub fn from_rgba(r: u8, g: u8, b: u8, a: u8) -> Option<Self> {
if r <= a && g <= a && b <= a {
Some(PremultipliedColorU8(pack_rgba(r, g, b, a)))
} else {
None
}
}
/// Creates a new color.
pub(crate) const fn from_rgba_unchecked(r: u8, g: u8, b: u8, a: u8) -> Self {
PremultipliedColorU8(pack_rgba(r, g, b, a))
}
/// Returns color's red component.
///
/// The value is <= alpha.
pub const fn red(self) -> u8 {
self.0.to_ne_bytes()[0]
}
/// Returns color's green component.
///
/// The value is <= alpha.
pub const fn green(self) -> u8 {
self.0.to_ne_bytes()[1]
}
/// Returns color's blue component.
///
/// The value is <= alpha.
pub const fn blue(self) -> u8 {
self.0.to_ne_bytes()[2]
}
/// Returns color's alpha component.
pub const fn alpha(self) -> u8 {
self.0.to_ne_bytes()[3]
}
/// Check that color is opaque.
///
/// Alpha == 255
pub fn is_opaque(&self) -> bool {
self.alpha() == ALPHA_U8_OPAQUE
}
/// Returns the value as a primitive type.
pub const fn get(self) -> u32 {
self.0
}
/// Returns a demultiplied color.
pub fn demultiply(&self) -> ColorU8 {
let alpha = self.alpha();
if alpha == ALPHA_U8_OPAQUE {
ColorU8(self.0)
} else {
let a = alpha as f64 / 255.0;
ColorU8::from_rgba(
(self.red() as f64 / a + 0.5) as u8,
(self.green() as f64 / a + 0.5) as u8,
(self.blue() as f64 / a + 0.5) as u8,
alpha,
)
}
}
}
impl core::fmt::Debug for PremultipliedColorU8 {
fn fmt(&self, f: &mut core::fmt::Formatter<'_>) -> core::fmt::Result {
f.debug_struct("PremultipliedColorU8")
.field("r", &self.red())
.field("g", &self.green())
.field("b", &self.blue())
.field("a", &self.alpha())
.finish()
}
}
/// An RGBA color value, holding four floating point components.
///
/// # Guarantees
///
/// - All values are in 0..=1 range.
#[derive(Copy, Clone, PartialEq, Debug)]
pub struct Color {
r: NormalizedF32,
g: NormalizedF32,
b: NormalizedF32,
a: NormalizedF32,
}
const NV_ZERO: NormalizedF32 = NormalizedF32::ZERO;
const NV_ONE: NormalizedF32 = NormalizedF32::ONE;
impl Color {
/// A transparent color.
pub const TRANSPARENT: Color = Color {
r: NV_ZERO,
g: NV_ZERO,
b: NV_ZERO,
a: NV_ZERO,
};
/// A black color.
pub const BLACK: Color = Color {
r: NV_ZERO,
g: NV_ZERO,
b: NV_ZERO,
a: NV_ONE,
};
/// A white color.
pub const WHITE: Color = Color {
r: NV_ONE,
g: NV_ONE,
b: NV_ONE,
a: NV_ONE,
};
/// Creates a new color from 4 components.
///
/// All values must be in 0..=1 range.
pub fn from_rgba(r: f32, g: f32, b: f32, a: f32) -> Option<Self> {
Some(Color {
r: NormalizedF32::new(r)?,
g: NormalizedF32::new(g)?,
b: NormalizedF32::new(b)?,
a: NormalizedF32::new(a)?,
})
}
/// Creates a new color from 4 components.
///
/// u8 will be divided by 255 to get the float component.
pub fn from_rgba8(r: u8, g: u8, b: u8, a: u8) -> Self {
Color {
r: NormalizedF32::new_u8(r),
g: NormalizedF32::new_u8(g),
b: NormalizedF32::new_u8(b),
a: NormalizedF32::new_u8(a),
}
}
/// Returns color's red component.
///
/// The value is guaranteed to be in the 0..=1 range.
pub fn red(&self) -> f32 {
self.r.get()
}
/// Returns color's green component.
///
/// The value is guaranteed to be in the 0..=1 range.
pub fn green(&self) -> f32 {
self.g.get()
}
/// Returns color's blue component.
///
/// The value is guaranteed to be in the 0..=1 range.
pub fn blue(&self) -> f32 {
self.b.get()
}
/// Returns color's alpha component.
///
/// The value is guaranteed to be in the 0..=1 range.
pub fn alpha(&self) -> f32 {
self.a.get()
}
/// Sets the red component value.
///
/// The new value will be clipped to the 0..=1 range.
pub fn set_red(&mut self, c: f32) {
self.r = NormalizedF32::new_clamped(c);
}
/// Sets the green component value.
///
/// The new value will be clipped to the 0..=1 range.
pub fn set_green(&mut self, c: f32) {
self.g = NormalizedF32::new_clamped(c);
}
/// Sets the blue component value.
///
/// The new value will be clipped to the 0..=1 range.
pub fn set_blue(&mut self, c: f32) {
self.b = NormalizedF32::new_clamped(c);
}
/// Sets the alpha component value.
///
/// The new value will be clipped to the 0..=1 range.
pub fn set_alpha(&mut self, c: f32) {
self.a = NormalizedF32::new_clamped(c);
}
/// Shifts color's opacity.
///
/// Essentially, multiplies color's alpha by opacity.
///
/// `opacity` will be clamped to the 0..=1 range first.
/// The final alpha will also be clamped.
pub fn apply_opacity(&mut self, opacity: f32) {
self.a = NormalizedF32::new_clamped(self.a.get() * opacity.bound(0.0, 1.0));
}
/// Check that color is opaque.
///
/// Alpha == 1.0
pub fn is_opaque(&self) -> bool {
self.a == ALPHA_OPAQUE
}
/// Converts into a premultiplied color.
pub fn premultiply(&self) -> PremultipliedColor {
if self.is_opaque() {
PremultipliedColor {
r: self.r,
g: self.g,
b: self.b,
a: self.a,
}
} else {
PremultipliedColor {
r: NormalizedF32::new_clamped(self.r.get() * self.a.get()),
g: NormalizedF32::new_clamped(self.g.get() * self.a.get()),
b: NormalizedF32::new_clamped(self.b.get() * self.a.get()),
a: self.a,
}
}
}
/// Converts into `ColorU8`.
pub fn to_color_u8(&self) -> ColorU8 {
let c = color_f32_to_u8(self.r, self.g, self.b, self.a);
ColorU8::from_rgba(c[0], c[1], c[2], c[3])
}
}
/// A premultiplied RGBA color value, holding four floating point components.
///
/// # Guarantees
///
/// - All values are in 0..=1 range.
/// - RGB components are <= A.
#[derive(Copy, Clone, PartialEq, Debug)]
pub struct PremultipliedColor {
r: NormalizedF32,
g: NormalizedF32,
b: NormalizedF32,
a: NormalizedF32,
}
impl PremultipliedColor {
/// Returns color's red component.
///
/// - The value is guaranteed to be in the 0..=1 range.
/// - The value is <= alpha.
pub fn red(&self) -> f32 {
self.r.get()
}
/// Returns color's green component.
///
/// - The value is guaranteed to be in the 0..=1 range.
/// - The value is <= alpha.
pub fn green(&self) -> f32 {
self.g.get()
}
/// Returns color's blue component.
///
/// - The value is guaranteed to be in the 0..=1 range.
/// - The value is <= alpha.
pub fn blue(&self) -> f32 {
self.b.get()
}
/// Returns color's alpha component.
///
/// - The value is guaranteed to be in the 0..=1 range.
pub fn alpha(&self) -> f32 {
self.a.get()
}
/// Returns a demultiplied color.
pub fn demultiply(&self) -> Color {
let a = self.a.get();
if a == 0.0 {
Color::TRANSPARENT
} else {
Color {
r: NormalizedF32::new_clamped(self.r.get() / a),
g: NormalizedF32::new_clamped(self.g.get() / a),
b: NormalizedF32::new_clamped(self.b.get() / a),
a: self.a,
}
}
}
/// Converts into `PremultipliedColorU8`.
pub fn to_color_u8(&self) -> PremultipliedColorU8 {
let c = color_f32_to_u8(self.r, self.g, self.b, self.a);
PremultipliedColorU8::from_rgba_unchecked(c[0], c[1], c[2], c[3])
}
}
/// Return a*b/255, rounding any fractional bits.
pub fn premultiply_u8(c: u8, a: u8) -> u8 {
let prod = u32::from(c) * u32::from(a) + 128;
((prod + (prod >> 8)) >> 8) as u8
}
const fn pack_rgba(r: u8, g: u8, b: u8, a: u8) -> u32 {
u32::from_ne_bytes([r, g, b, a])
}
fn color_f32_to_u8(
r: NormalizedF32,
g: NormalizedF32,
b: NormalizedF32,
a: NormalizedF32,
) -> [u8; 4] {
[
(r.get() * 255.0 + 0.5) as u8,
(g.get() * 255.0 + 0.5) as u8,
(b.get() * 255.0 + 0.5) as u8,
(a.get() * 255.0 + 0.5) as u8,
]
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn premultiply_u8() {
assert_eq!(
ColorU8::from_rgba(10, 20, 30, 40).premultiply(),
PremultipliedColorU8::from_rgba_unchecked(2, 3, 5, 40)
);
}
#[test]
fn premultiply_u8_opaque() {
assert_eq!(
ColorU8::from_rgba(10, 20, 30, 255).premultiply(),
PremultipliedColorU8::from_rgba_unchecked(10, 20, 30, 255)
);
}
#[test]
fn demultiply_u8_1() {
assert_eq!(
PremultipliedColorU8::from_rgba_unchecked(2, 3, 5, 40).demultiply(),
ColorU8::from_rgba(13, 19, 32, 40)
);
}
#[test]
fn demultiply_u8_2() {
assert_eq!(
PremultipliedColorU8::from_rgba_unchecked(10, 20, 30, 255).demultiply(),
ColorU8::from_rgba(10, 20, 30, 255)
);
}
#[test]
fn demultiply_u8_3() {
assert_eq!(
PremultipliedColorU8::from_rgba_unchecked(153, 99, 54, 180).demultiply(),
ColorU8::from_rgba(217, 140, 77, 180)
);
}
}
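`premultiply_u8` above relies on a classic divide-by-255 trick: with a +128 bias, `(prod + (prod >> 8)) >> 8` computes round-to-nearest of `c*a/255` without a division. A sketch that checks this exhaustively against the integer rational form `(2x + 255) / 510` (assumed here to be round-half-up of `x/255`):

```rust
// The fast form used by premultiply_u8 above.
fn fast(c: u8, a: u8) -> u8 {
    let prod = u32::from(c) * u32::from(a) + 128;
    ((prod + (prod >> 8)) >> 8) as u8
}

// Round-half-up of (c * a) / 255 in integer arithmetic:
// round(x / 255) = (2x + 255) / 510.
fn exact(c: u8, a: u8) -> u8 {
    ((2 * u32::from(c) * u32::from(a) + 255) / 510) as u8
}

fn main() {
    // Exhaustive over all 65536 input pairs; cheap enough to run directly.
    for c in 0..=255u8 {
        for a in 0..=255u8 {
            assert_eq!(fast(c, a), exact(c, a));
        }
    }
}
```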

third-party/vendor/tiny-skia/src/edge.rs vendored Normal file

@ -0,0 +1,565 @@
// Copyright 2006 The Android Open Source Project
// Copyright 2020 Yevhenii Reizner
//
// Use of this source code is governed by a BSD-style license that can be
// found in the LICENSE file.
use crate::Point;
use crate::fixed_point::{fdot16, fdot6, FDot16, FDot6};
use crate::math::left_shift;
/// We store 1<<shift in a (signed) byte, so its maximum value is 1<<6 == 64.
///
/// Note that this limits the number of lines we use to approximate a curve.
/// If we need to increase this, we need to store curve_count in something
/// larger than i8.
const MAX_COEFF_SHIFT: i32 = 6;
#[derive(Clone, Debug)]
pub enum Edge {
Line(LineEdge),
Quadratic(QuadraticEdge),
Cubic(CubicEdge),
}
impl Edge {
pub fn as_line(&self) -> &LineEdge {
match self {
Edge::Line(line) => line,
Edge::Quadratic(quad) => &quad.line,
Edge::Cubic(cubic) => &cubic.line,
}
}
pub fn as_line_mut(&mut self) -> &mut LineEdge {
match self {
Edge::Line(line) => line,
Edge::Quadratic(quad) => &mut quad.line,
Edge::Cubic(cubic) => &mut cubic.line,
}
}
}
impl core::ops::Deref for Edge {
type Target = LineEdge;
fn deref(&self) -> &Self::Target {
self.as_line()
}
}
impl core::ops::DerefMut for Edge {
fn deref_mut(&mut self) -> &mut Self::Target {
self.as_line_mut()
}
}
#[derive(Clone, Default, Debug)]
pub struct LineEdge {
// Imitate a linked list.
pub prev: Option<u32>,
pub next: Option<u32>,
pub x: FDot16,
pub dx: FDot16,
pub first_y: i32,
pub last_y: i32,
pub winding: i8, // 1 or -1
}
impl LineEdge {
pub fn new(p0: Point, p1: Point, shift: i32) -> Option<Self> {
let scale = (1 << (shift + 6)) as f32;
let mut x0 = (p0.x * scale) as i32;
let mut y0 = (p0.y * scale) as i32;
let mut x1 = (p1.x * scale) as i32;
let mut y1 = (p1.y * scale) as i32;
let mut winding = 1;
if y0 > y1 {
core::mem::swap(&mut x0, &mut x1);
core::mem::swap(&mut y0, &mut y1);
winding = -1;
}
let top = fdot6::round(y0);
let bottom = fdot6::round(y1);
// are we a zero-height line?
if top == bottom {
return None;
}
let slope = fdot6::div(x1 - x0, y1 - y0);
let dy = compute_dy(top, y0);
Some(LineEdge {
next: None,
prev: None,
x: fdot6::to_fdot16(x0 + fdot16::mul(slope, dy)),
dx: slope,
first_y: top,
last_y: bottom - 1,
winding,
})
}
pub fn is_vertical(&self) -> bool {
self.dx == 0
}
fn update(&mut self, mut x0: FDot16, mut y0: FDot16, mut x1: FDot16, mut y1: FDot16) -> bool {
debug_assert!(self.winding == 1 || self.winding == -1);
y0 >>= 10;
y1 >>= 10;
debug_assert!(y0 <= y1);
let top = fdot6::round(y0);
let bottom = fdot6::round(y1);
// are we a zero-height line?
if top == bottom {
return false;
}
x0 >>= 10;
x1 >>= 10;
let slope = fdot6::div(x1 - x0, y1 - y0);
let dy = compute_dy(top, y0);
self.x = fdot6::to_fdot16(x0 + fdot16::mul(slope, dy));
self.dx = slope;
self.first_y = top;
self.last_y = bottom - 1;
true
}
}
#[derive(Clone, Debug)]
pub struct QuadraticEdge {
pub line: LineEdge,
pub curve_count: i8,
curve_shift: u8, // applied to all dx/ddx/dddx
qx: FDot16,
qy: FDot16,
qdx: FDot16,
qdy: FDot16,
qddx: FDot16,
qddy: FDot16,
q_last_x: FDot16,
q_last_y: FDot16,
}
impl QuadraticEdge {
pub fn new(points: &[Point], shift: i32) -> Option<Self> {
let mut quad = Self::new2(points, shift)?;
if quad.update() {
Some(quad)
} else {
None
}
}
fn new2(points: &[Point], mut shift: i32) -> Option<Self> {
let scale = (1 << (shift + 6)) as f32;
let mut x0 = (points[0].x * scale) as i32;
let mut y0 = (points[0].y * scale) as i32;
let x1 = (points[1].x * scale) as i32;
let y1 = (points[1].y * scale) as i32;
let mut x2 = (points[2].x * scale) as i32;
let mut y2 = (points[2].y * scale) as i32;
let mut winding = 1;
if y0 > y2 {
core::mem::swap(&mut x0, &mut x2);
core::mem::swap(&mut y0, &mut y2);
winding = -1;
}
debug_assert!(y0 <= y1 && y1 <= y2);
let top = fdot6::round(y0);
let bottom = fdot6::round(y2);
// are we a zero-height quad (line)?
if top == bottom {
return None;
}
// compute number of steps needed (1 << shift)
{
let dx = (left_shift(x1, 1) - x0 - x2) >> 2;
let dy = (left_shift(y1, 1) - y0 - y2) >> 2;
// This is a little confusing:
// before this line, shift is the scale up factor for AA;
// after this line, shift is the fCurveShift.
shift = diff_to_shift(dx, dy, shift);
debug_assert!(shift >= 0);
}
// need at least 1 subdivision for our bias trick
if shift == 0 {
shift = 1;
} else if shift > MAX_COEFF_SHIFT {
shift = MAX_COEFF_SHIFT;
}
let curve_count = (1 << shift) as i8;
// We want to reformulate into polynomial form, to make it clear how we
// should forward-difference.
//
// p0 (1 - t)^2 + p1 t(1 - t) + p2 t^2 ==> At^2 + Bt + C
//
// A = p0 - 2p1 + p2
// B = 2(p1 - p0)
// C = p0
//
// Our caller must have constrained our inputs (p0..p2) to all fit into
// 16.16. However, as seen above, we sometimes compute values that can be
// larger (e.g. B = 2*(p1 - p0)). To guard against overflow, we will store
// A and B at 1/2 of their actual value, and just apply a 2x scale during
// application in updateQuadratic(). Hence we store (shift - 1) in
// curve_shift.
let curve_shift = (shift - 1) as u8;
let mut a = fdot6_to_fixed_div2(x0 - x1 - x1 + x2); // 1/2 the real value
let mut b = fdot6::to_fdot16(x1 - x0); // 1/2 the real value
let qx = fdot6::to_fdot16(x0);
let qdx = b + (a >> shift); // biased by shift
let qddx = a >> (shift - 1); // biased by shift
a = fdot6_to_fixed_div2(y0 - y1 - y1 + y2); // 1/2 the real value
b = fdot6::to_fdot16(y1 - y0); // 1/2 the real value
let qy = fdot6::to_fdot16(y0);
let qdy = b + (a >> shift); // biased by shift
let qddy = a >> (shift - 1); // biased by shift
let q_last_x = fdot6::to_fdot16(x2);
let q_last_y = fdot6::to_fdot16(y2);
Some(QuadraticEdge {
line: LineEdge {
next: None,
prev: None,
x: 0,
dx: 0,
first_y: 0,
last_y: 0,
winding,
},
curve_count,
curve_shift,
qx,
qy,
qdx,
qdy,
qddx,
qddy,
q_last_x,
q_last_y,
})
}
pub fn update(&mut self) -> bool {
let mut success;
let mut count = self.curve_count;
let mut oldx = self.qx;
let mut oldy = self.qy;
let mut dx = self.qdx;
let mut dy = self.qdy;
let mut newx;
let mut newy;
let shift = self.curve_shift;
debug_assert!(count > 0);
loop {
count -= 1;
if count > 0 {
newx = oldx + (dx >> shift);
dx += self.qddx;
newy = oldy + (dy >> shift);
dy += self.qddy;
} else {
// last segment
newx = self.q_last_x;
newy = self.q_last_y;
}
success = self.line.update(oldx, oldy, newx, newy);
oldx = newx;
oldy = newy;
if count == 0 || success {
break;
}
}
self.qx = newx;
self.qy = newy;
self.qdx = dx;
self.qdy = dy;
self.curve_count = count as i8;
success
}
}
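The polynomial-form comment above (At² + Bt + C) is what makes forward differencing work: each step adds a running first difference, which itself grows by a constant second difference, so evaluation costs two additions per point. A plain f64 sketch of the idea, without the fixed-point scaling and shift biasing used above:

```rust
// Evaluate A*t^2 + B*t + C at t = 0, h, 2h, ..., 1 by forward differencing.
fn quad_forward_diff(a: f64, b: f64, c: f64, n: u32) -> Vec<f64> {
    let h = 1.0 / f64::from(n);
    let mut x = c;                  // value at t = 0
    let mut d = a * h * h + b * h;  // first difference at t = 0
    let dd = 2.0 * a * h * h;       // constant second difference
    let mut out = Vec::with_capacity(n as usize + 1);
    for _ in 0..=n {
        out.push(x);
        x += d;  // advance the value
        d += dd; // advance the slope
    }
    out
}

fn main() {
    let (a, b, c) = (3.0, -2.0, 1.0);
    let vals = quad_forward_diff(a, b, c, 8);
    // Every sample matches direct evaluation of the polynomial.
    for (i, &v) in vals.iter().enumerate() {
        let t = i as f64 / 8.0;
        assert!((v - (a * t * t + b * t + c)).abs() < 1e-9);
    }
}
```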
#[derive(Clone, Debug)]
pub struct CubicEdge {
pub line: LineEdge,
pub curve_count: i8,
curve_shift: u8, // applied to all dx/ddx/dddx except for dshift exception
dshift: u8, // applied to cdx and cdy
cx: FDot16,
cy: FDot16,
cdx: FDot16,
cdy: FDot16,
cddx: FDot16,
cddy: FDot16,
cdddx: FDot16,
cdddy: FDot16,
c_last_x: FDot16,
c_last_y: FDot16,
}
impl CubicEdge {
pub fn new(points: &[Point], shift: i32) -> Option<Self> {
let mut cubic = Self::new2(points, shift, true)?;
if cubic.update() {
Some(cubic)
} else {
None
}
}
fn new2(points: &[Point], mut shift: i32, sort_y: bool) -> Option<Self> {
let scale = (1 << (shift + 6)) as f32;
let mut x0 = (points[0].x * scale) as i32;
let mut y0 = (points[0].y * scale) as i32;
let mut x1 = (points[1].x * scale) as i32;
let mut y1 = (points[1].y * scale) as i32;
let mut x2 = (points[2].x * scale) as i32;
let mut y2 = (points[2].y * scale) as i32;
let mut x3 = (points[3].x * scale) as i32;
let mut y3 = (points[3].y * scale) as i32;
let mut winding = 1;
if sort_y && y0 > y3 {
core::mem::swap(&mut x0, &mut x3);
core::mem::swap(&mut x1, &mut x2);
core::mem::swap(&mut y0, &mut y3);
core::mem::swap(&mut y1, &mut y2);
winding = -1;
}
let top = fdot6::round(y0);
let bot = fdot6::round(y3);
// are we a zero-height cubic (line)?
if sort_y && top == bot {
return None;
}
// compute number of steps needed (1 << shift)
{
// Can't use (center of curve - center of baseline), since the center of the
// curve need not be the max delta from the baseline (it could even be
// coincident), so we just look at the two off-curve points
let dx = cubic_delta_from_line(x0, x1, x2, x3);
let dy = cubic_delta_from_line(y0, y1, y2, y3);
// add 1 (by observation)
shift = diff_to_shift(dx, dy, 2) + 1;
}
// need at least 1 subdivision for our bias trick
debug_assert!(shift > 0);
if shift > MAX_COEFF_SHIFT {
shift = MAX_COEFF_SHIFT;
}
// Since our incoming data is initially shifted down by 10 (or 8 for
// antialiasing), the most we can shift up is 8. However, we
// compute coefficients with a 3*, so the safest upshift is really 6
let mut up_shift = 6; // largest safe value
let mut down_shift = shift + up_shift - 10;
if down_shift < 0 {
down_shift = 0;
up_shift = 10 - shift;
}
let curve_count = left_shift(-1, shift) as i8;
let curve_shift = shift as u8;
let dshift = down_shift as u8;
let mut b = fdot6_up_shift(3 * (x1 - x0), up_shift);
let mut c = fdot6_up_shift(3 * (x0 - x1 - x1 + x2), up_shift);
let mut d = fdot6_up_shift(x3 + 3 * (x1 - x2) - x0, up_shift);
let cx = fdot6::to_fdot16(x0);
let cdx = b + (c >> shift) + (d >> (2 * shift)); // biased by shift
let cddx = 2 * c + ((3 * d) >> (shift - 1)); // biased by 2*shift
let cdddx = (3 * d) >> (shift - 1); // biased by 2*shift
b = fdot6_up_shift(3 * (y1 - y0), up_shift);
c = fdot6_up_shift(3 * (y0 - y1 - y1 + y2), up_shift);
d = fdot6_up_shift(y3 + 3 * (y1 - y2) - y0, up_shift);
let cy = fdot6::to_fdot16(y0);
let cdy = b + (c >> shift) + (d >> (2 * shift)); // biased by shift
let cddy = 2 * c + ((3 * d) >> (shift - 1)); // biased by 2*shift
let cdddy = (3 * d) >> (shift - 1); // biased by 2*shift
let c_last_x = fdot6::to_fdot16(x3);
let c_last_y = fdot6::to_fdot16(y3);
Some(CubicEdge {
line: LineEdge {
next: None,
prev: None,
x: 0,
dx: 0,
first_y: 0,
last_y: 0,
winding,
},
curve_count,
curve_shift,
dshift,
cx,
cy,
cdx,
cdy,
cddx,
cddy,
cdddx,
cdddy,
c_last_x,
c_last_y,
})
}
pub fn update(&mut self) -> bool {
let mut success;
let mut count = self.curve_count;
let mut oldx = self.cx;
let mut oldy = self.cy;
let mut newx;
let mut newy;
let ddshift = self.curve_shift;
let dshift = self.dshift;
debug_assert!(count < 0);
loop {
count += 1;
if count < 0 {
newx = oldx + (self.cdx >> dshift);
self.cdx += self.cddx >> ddshift;
self.cddx += self.cdddx;
newy = oldy + (self.cdy >> dshift);
self.cdy += self.cddy >> ddshift;
self.cddy += self.cdddy;
} else {
// last segment
newx = self.c_last_x;
newy = self.c_last_y;
}
// we want to say debug_assert(oldy <= newy), but our finite fixedpoint
// doesn't always achieve that, so we have to explicitly pin it here.
if newy < oldy {
newy = oldy;
}
success = self.line.update(oldx, oldy, newx, newy);
oldx = newx;
oldy = newy;
if count == 0 || success {
break;
}
}
self.cx = newx;
self.cy = newy;
self.curve_count = count;
success
}
}
// This correctly favors the lower-pixel when y0 is on a 1/2 pixel boundary
fn compute_dy(top: FDot6, y0: FDot6) -> FDot6 {
left_shift(top, 6) + 32 - y0
}
fn diff_to_shift(dx: FDot6, dy: FDot6, shift_aa: i32) -> i32 {
// cheap calc of distance from center of p0-p2 to the center of the curve
let mut dist = cheap_distance(dx, dy);
// shift down dist (it is currently in dot6)
// down by 3 should give us 1/8 pixel accuracy (assuming our dist is accurate...)
// this is chosen by heuristic: make it as big as possible (to minimize segments)
// ... but small enough so that our curves still look smooth
// When shift > 0, we're using AA and everything is scaled up so we can
// lower the accuracy.
dist = (dist + (1 << 4)) >> (3 + shift_aa);
// each subdivision (shift value) cuts this dist (error) by 1/4
(32 - dist.leading_zeros() as i32) >> 1
}
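The final line of `diff_to_shift` is roughly a base-4 logarithm: `32 - leading_zeros` is the bit length of `dist`, and halving a bit length approximates log4, which matches "each subdivision cuts this dist (error) by 1/4". A standalone sketch of just that step (hypothetical helper name):

```rust
/// Bit length halved ~= log4: the number of subdivisions needed when
/// each subdivision shrinks the error `dist` by a factor of 4.
fn shift_for_dist(dist: i32) -> i32 {
    (32 - dist.leading_zeros() as i32) >> 1
}
```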
fn cheap_distance(mut dx: FDot6, mut dy: FDot6) -> FDot6 {
dx = dx.abs();
dy = dy.abs();
// return max + min/2
if dx > dy {
dx + (dy >> 1)
} else {
dy + (dx >> 1)
}
}
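`cheap_distance` replaces a square root with `max + min/2`, which never underestimates the true length and overestimates by at most about 12% (worst near a 2:1 slope). A self-contained copy to illustrate the bound:

```rust
/// Approximate sqrt(dx^2 + dy^2) as max + min/2. Exact for axis-aligned
/// vectors; never more than ~12% above the true length.
fn cheap_distance(dx: i32, dy: i32) -> i32 {
    let (dx, dy) = (dx.abs(), dy.abs());
    if dx > dy {
        dx + (dy >> 1)
    } else {
        dy + (dx >> 1)
    }
}
```

For the 3-4-5 triangle scaled by 10 the estimate is 55 against a true 50, about +10%.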
// In LineEdge::new, QuadraticEdge::new, CubicEdge::new, the first thing we do is to convert
// the points into FDot6. This is modulated by the shift parameter, which
// will either be 0, or something like 2 for antialiasing.
//
// In the float case, we want to turn the float into .6 by saying pt * 64,
// or pt * 256 for antialiasing. This is implemented as 1 << (shift + 6).
//
// In the fixed case, we want to turn the fixed into .6 by saying pt >> 10,
// or pt >> 8 for antialiasing. This is implemented as pt >> (10 - shift).
fn fdot6_to_fixed_div2(value: FDot6) -> FDot16 {
// we want to return SkFDot6ToFixed(value >> 1), but we don't want to throw
// away data in value, so we perform a modified up-shift instead
left_shift(value, 16 - 6 - 1)
}
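Since FDot6 stores x as x·64 and FDot16 stores x as x·65536, a plain FDot6 → FDot16 conversion is a left shift by 10; shifting by one less, as `fdot6_to_fixed_div2` does, bakes in the divide-by-two without first discarding a bit of `value`. A minimal sketch (plain `i32` stand-ins for the typedefs):

```rust
/// FDot6 -> FDot16 is a shift by 16 - 6 = 10.
fn fdot6_to_fdot16(v: i32) -> i32 {
    v << 10
}

/// Shifting by 9 instead yields half the value, keeping all low bits.
fn fdot6_to_fdot16_div2(v: i32) -> i32 {
    v << 9
}
```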
fn fdot6_up_shift(x: FDot6, up_shift: i32) -> i32 {
debug_assert!((left_shift(x, up_shift) >> up_shift) == x);
left_shift(x, up_shift)
}
// f(1/3) = (8a + 12b + 6c + d) / 27
// f(2/3) = (a + 6b + 12c + 8d) / 27
//
// f(1/3)-b = (8a - 15b + 6c + d) / 27
// f(2/3)-c = (a + 6b - 15c + 8d) / 27
//
// use 19/512 to approximate 1/27
fn cubic_delta_from_line(a: FDot6, b: FDot6, c: FDot6, d: FDot6) -> FDot6 {
// since our parameters may be negative, we don't use <<
let one_third = ((a * 8 - b * 15 + 6 * c + d) * 19) >> 9;
let two_third = ((a + 6 * b - c * 15 + d * 8) * 19) >> 9;
one_third.abs().max(two_third.abs())
}
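The `* 19 >> 9` above is the 19/512 ≈ 1/27 approximation: 19/512 = 0.03711…, versus 1/27 = 0.03703…, an error of about 0.2%. A quick standalone check (hypothetical helper name):

```rust
/// Integer approximation of v / 27 as (v * 19) >> 9, i.e. v * 19/512.
fn approx_div_27(v: i32) -> i32 {
    (v * 19) >> 9
}
```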


@ -0,0 +1,349 @@
// Copyright 2011 Google Inc.
// Copyright 2020 Yevhenii Reizner
//
// Use of this source code is governed by a BSD-style license that can be
// found in the LICENSE file.
use alloc::vec::Vec;
use tiny_skia_path::{PathVerb, ScreenIntRect};
use crate::{Path, Point};
use crate::edge::{CubicEdge, Edge, LineEdge, QuadraticEdge};
use crate::edge_clipper::EdgeClipperIter;
use crate::path_geometry;
#[derive(Copy, Clone, PartialEq, Debug)]
enum Combine {
No,
Partial,
Total,
}
#[derive(Copy, Clone, Debug)]
pub struct ShiftedIntRect {
shifted: ScreenIntRect,
shift: i32,
}
impl ShiftedIntRect {
pub fn new(rect: &ScreenIntRect, shift: i32) -> Option<Self> {
Some(ShiftedIntRect {
shifted: ScreenIntRect::from_xywh(
rect.x() << shift,
rect.y() << shift,
rect.width() << shift,
rect.height() << shift,
)?,
shift,
})
}
pub fn shifted(&self) -> &ScreenIntRect {
&self.shifted
}
pub fn recover(&self) -> ScreenIntRect {
ScreenIntRect::from_xywh(
self.shifted.x() >> self.shift,
self.shifted.y() >> self.shift,
self.shifted.width() >> self.shift,
self.shifted.height() >> self.shift,
)
.unwrap() // cannot fail, because the original rect was valid
}
}
pub struct BasicEdgeBuilder {
edges: Vec<Edge>,
clip_shift: i32,
}
impl BasicEdgeBuilder {
pub fn new(clip_shift: i32) -> Self {
BasicEdgeBuilder {
edges: Vec::with_capacity(64), // TODO: stack array + fallback
clip_shift,
}
}
// Skia returns a linked list here, but it's a nightmare to use in Rust,
// so we're mimicking it with Vec.
pub fn build_edges(
path: &Path,
clip: Option<&ShiftedIntRect>,
clip_shift: i32,
) -> Option<Vec<Edge>> {
// If we're convex, then we need both edges, even if the right edge is past the clip.
// let can_cull_to_the_right = !path.isConvex();
let can_cull_to_the_right = false; // TODO: this
let mut builder = BasicEdgeBuilder::new(clip_shift);
builder.build(path, clip, can_cull_to_the_right)?;
if builder.edges.len() < 2 {
return None;
}
Some(builder.edges)
}
// TODO: build_poly
pub fn build(
&mut self,
path: &Path,
clip: Option<&ShiftedIntRect>,
can_cull_to_the_right: bool,
) -> Option<()> {
if let Some(clip) = clip {
let clip = clip.recover().to_rect();
for edges in EdgeClipperIter::new(path, clip, can_cull_to_the_right) {
for edge in edges {
match edge {
PathEdge::LineTo(p0, p1) => {
if !p0.is_finite() || !p1.is_finite() {
return None;
}
self.push_line(&[p0, p1])
}
PathEdge::QuadTo(p0, p1, p2) => {
if !p0.is_finite() || !p1.is_finite() || !p2.is_finite() {
return None;
}
self.push_quad(&[p0, p1, p2])
}
PathEdge::CubicTo(p0, p1, p2, p3) => {
if !p0.is_finite()
|| !p1.is_finite()
|| !p2.is_finite()
|| !p3.is_finite()
{
return None;
}
self.push_cubic(&[p0, p1, p2, p3])
}
}
}
}
} else {
for edge in edge_iter(path) {
match edge {
PathEdge::LineTo(p0, p1) => {
self.push_line(&[p0, p1]);
}
PathEdge::QuadTo(p0, p1, p2) => {
let points = [p0, p1, p2];
let mut mono_x = [Point::zero(); 5];
let n = path_geometry::chop_quad_at_y_extrema(&points, &mut mono_x);
for i in 0..=n {
self.push_quad(&mono_x[i * 2..]);
}
}
PathEdge::CubicTo(p0, p1, p2, p3) => {
let points = [p0, p1, p2, p3];
let mut mono_y = [Point::zero(); 10];
let n = path_geometry::chop_cubic_at_y_extrema(&points, &mut mono_y);
for i in 0..=n {
self.push_cubic(&mono_y[i * 3..]);
}
}
}
}
}
Some(())
}
fn push_line(&mut self, points: &[Point; 2]) {
if let Some(edge) = LineEdge::new(points[0], points[1], self.clip_shift) {
let combine = if edge.is_vertical() && !self.edges.is_empty() {
if let Some(Edge::Line(last)) = self.edges.last_mut() {
combine_vertical(&edge, last)
} else {
Combine::No
}
} else {
Combine::No
};
match combine {
Combine::Total => {
self.edges.pop();
}
Combine::Partial => {}
Combine::No => self.edges.push(Edge::Line(edge)),
}
}
}
fn push_quad(&mut self, points: &[Point]) {
if let Some(edge) = QuadraticEdge::new(points, self.clip_shift) {
self.edges.push(Edge::Quadratic(edge));
}
}
fn push_cubic(&mut self, points: &[Point]) {
if let Some(edge) = CubicEdge::new(points, self.clip_shift) {
self.edges.push(Edge::Cubic(edge));
}
}
}
fn combine_vertical(edge: &LineEdge, last: &mut LineEdge) -> Combine {
if last.dx != 0 || edge.x != last.x {
return Combine::No;
}
if edge.winding == last.winding {
return if edge.last_y + 1 == last.first_y {
last.first_y = edge.first_y;
Combine::Partial
} else if edge.first_y == last.last_y + 1 {
last.last_y = edge.last_y;
Combine::Partial
} else {
Combine::No
};
}
if edge.first_y == last.first_y {
return if edge.last_y == last.last_y {
Combine::Total
} else if edge.last_y < last.last_y {
last.first_y = edge.last_y + 1;
Combine::Partial
} else {
last.first_y = last.last_y + 1;
last.last_y = edge.last_y;
last.winding = edge.winding;
Combine::Partial
};
}
if edge.last_y == last.last_y {
if edge.first_y > last.first_y {
last.last_y = edge.first_y - 1;
} else {
last.last_y = last.first_y - 1;
last.first_y = edge.first_y;
last.winding = edge.winding;
}
return Combine::Partial;
}
Combine::No
}
pub fn edge_iter(path: &Path) -> PathEdgeIter {
PathEdgeIter {
path,
verb_index: 0,
points_index: 0,
move_to: Point::zero(),
needs_close_line: false,
}
}
#[derive(Copy, Clone, PartialEq, Debug)]
pub enum PathEdge {
LineTo(Point, Point),
QuadTo(Point, Point, Point),
CubicTo(Point, Point, Point, Point),
}
/// Lightweight variant of PathIter that only returns segments (e.g. lines/quads).
///
/// Does not return Move or Close. Always "auto-closes" each contour.
pub struct PathEdgeIter<'a> {
path: &'a Path,
verb_index: usize,
points_index: usize,
move_to: Point,
needs_close_line: bool,
}
impl<'a> PathEdgeIter<'a> {
fn close_line(&mut self) -> Option<PathEdge> {
self.needs_close_line = false;
let edge = PathEdge::LineTo(self.path.points()[self.points_index - 1], self.move_to);
Some(edge)
}
}
impl<'a> Iterator for PathEdgeIter<'a> {
type Item = PathEdge;
fn next(&mut self) -> Option<Self::Item> {
if self.verb_index < self.path.verbs().len() {
let verb = self.path.verbs()[self.verb_index];
self.verb_index += 1;
match verb {
PathVerb::Move => {
if self.needs_close_line {
let res = self.close_line();
self.move_to = self.path.points()[self.points_index];
self.points_index += 1;
return res;
}
self.move_to = self.path.points()[self.points_index];
self.points_index += 1;
self.next()
}
PathVerb::Close => {
if self.needs_close_line {
return self.close_line();
}
self.next()
}
_ => {
// Actual edge.
self.needs_close_line = true;
let edge;
match verb {
PathVerb::Line => {
edge = PathEdge::LineTo(
self.path.points()[self.points_index - 1],
self.path.points()[self.points_index + 0],
);
self.points_index += 1;
}
PathVerb::Quad => {
edge = PathEdge::QuadTo(
self.path.points()[self.points_index - 1],
self.path.points()[self.points_index + 0],
self.path.points()[self.points_index + 1],
);
self.points_index += 2;
}
PathVerb::Cubic => {
edge = PathEdge::CubicTo(
self.path.points()[self.points_index - 1],
self.path.points()[self.points_index + 0],
self.path.points()[self.points_index + 1],
self.path.points()[self.points_index + 2],
);
self.points_index += 3;
}
_ => unreachable!(),
};
Some(edge)
}
}
} else if self.needs_close_line {
self.close_line()
} else {
None
}
}
}


@ -0,0 +1,569 @@
// Copyright 2009 The Android Open Source Project
// Copyright 2020 Yevhenii Reizner
//
// Use of this source code is governed by a BSD-style license that can be
// found in the LICENSE file.
use arrayvec::ArrayVec;
use tiny_skia_path::{NormalizedF32Exclusive, SCALAR_MAX};
use crate::{Path, Point, Rect};
use crate::edge_builder::{edge_iter, PathEdge, PathEdgeIter};
use crate::line_clipper;
use crate::path_geometry;
#[cfg(all(not(feature = "std"), feature = "no-std-float"))]
use tiny_skia_path::NoStdFloat;
// This is a fail-safe `arr[n..n+3].try_into().unwrap()` alternative.
// Everything is checked at compile time, so there is no bounds checking and no panics.
macro_rules! copy_3_points {
($arr:expr, $i:expr) => {
[$arr[$i], $arr[$i + 1], $arr[$i + 2]]
};
}
macro_rules! copy_4_points {
($arr:expr, $i:expr) => {
[$arr[$i], $arr[$i + 1], $arr[$i + 2], $arr[$i + 3]]
};
}
/// Chopping at max curvature in X and Y can split a cubic into up to 9 pieces,
/// and each piece may emit a line + cubic pair, hence 18 verbs.
const MAX_VERBS: usize = 18;
pub type ClippedEdges = ArrayVec<PathEdge, MAX_VERBS>;
pub struct EdgeClipper {
clip: Rect,
can_cull_to_the_right: bool,
edges: ClippedEdges,
}
impl EdgeClipper {
fn new(clip: Rect, can_cull_to_the_right: bool) -> Self {
EdgeClipper {
clip,
can_cull_to_the_right,
edges: ArrayVec::new(),
}
}
fn clip_line(mut self, p0: Point, p1: Point) -> Option<ClippedEdges> {
let mut points = [Point::zero(); line_clipper::MAX_POINTS];
let points = line_clipper::clip(
&[p0, p1],
&self.clip,
self.can_cull_to_the_right,
&mut points,
);
if !points.is_empty() {
for i in 0..points.len() - 1 {
self.push_line(points[i], points[i + 1]);
}
}
if self.edges.is_empty() {
None
} else {
Some(self.edges)
}
}
fn push_line(&mut self, p0: Point, p1: Point) {
self.edges.push(PathEdge::LineTo(p0, p1));
}
fn push_vline(&mut self, x: f32, mut y0: f32, mut y1: f32, reverse: bool) {
if reverse {
core::mem::swap(&mut y0, &mut y1);
}
self.edges.push(PathEdge::LineTo(
Point::from_xy(x, y0),
Point::from_xy(x, y1),
));
}
fn clip_quad(mut self, p0: Point, p1: Point, p2: Point) -> Option<ClippedEdges> {
let pts = [p0, p1, p2];
let bounds = Rect::from_points(&pts)?;
if !quick_reject(&bounds, &self.clip) {
let mut mono_y = [Point::zero(); 5];
let count_y = path_geometry::chop_quad_at_y_extrema(&pts, &mut mono_y);
for y in 0..=count_y {
let mut mono_x = [Point::zero(); 5];
let y_points: [Point; 3] = copy_3_points!(mono_y, y * 2);
let count_x = path_geometry::chop_quad_at_x_extrema(&y_points, &mut mono_x);
for x in 0..=count_x {
let x_points: [Point; 3] = copy_3_points!(mono_x, x * 2);
self.clip_mono_quad(&x_points);
}
}
}
if self.edges.is_empty() {
None
} else {
Some(self.edges)
}
}
// src[] must be monotonic in X and Y
fn clip_mono_quad(&mut self, src: &[Point; 3]) {
let mut pts = [Point::zero(); 3];
let mut reverse = sort_increasing_y(src, &mut pts);
// are we completely above or below
if pts[2].y <= self.clip.top() || pts[0].y >= self.clip.bottom() {
return;
}
// Now chop so that pts is contained within clip in Y
chop_quad_in_y(&self.clip, &mut pts);
if pts[0].x > pts[2].x {
pts.swap(0, 2);
reverse = !reverse;
}
debug_assert!(pts[0].x <= pts[1].x);
debug_assert!(pts[1].x <= pts[2].x);
// Now chop in X as needed, and record the segments
if pts[2].x <= self.clip.left() {
// wholly to the left
self.push_vline(self.clip.left(), pts[0].y, pts[2].y, reverse);
return;
}
if pts[0].x >= self.clip.right() {
// wholly to the right
if !self.can_cull_to_the_right {
self.push_vline(self.clip.right(), pts[0].y, pts[2].y, reverse);
}
return;
}
let mut t = NormalizedF32Exclusive::ANY;
let mut tmp = [Point::zero(); 5];
// are we partially to the left
if pts[0].x < self.clip.left() {
if chop_mono_quad_at_x(&pts, self.clip.left(), &mut t) {
path_geometry::chop_quad_at(&pts, t, &mut tmp);
self.push_vline(self.clip.left(), tmp[0].y, tmp[2].y, reverse);
// clamp to clean up imprecise numerics in the chop
tmp[2].x = self.clip.left();
tmp[3].x = tmp[3].x.max(self.clip.left());
pts[0] = tmp[2];
pts[1] = tmp[3];
} else {
// if chop_mono_quad_at_x failed, then we may have hit inexact numerics
// so we just clamp against the left
self.push_vline(self.clip.left(), pts[0].y, pts[2].y, reverse);
return;
}
}
// are we partially to the right
if pts[2].x > self.clip.right() {
if chop_mono_quad_at_x(&pts, self.clip.right(), &mut t) {
path_geometry::chop_quad_at(&pts, t, &mut tmp);
// clamp to clean up imprecise numerics in the chop
tmp[1].x = tmp[1].x.min(self.clip.right());
tmp[2].x = self.clip.right();
self.push_quad(&copy_3_points!(tmp, 0), reverse);
self.push_vline(self.clip.right(), tmp[2].y, tmp[4].y, reverse);
} else {
// if chop_mono_quad_at_x failed, then we may have hit inexact numerics
// so we just clamp against the right
pts[1].x = pts[1].x.min(self.clip.right());
pts[2].x = pts[2].x.min(self.clip.right());
self.push_quad(&pts, reverse);
}
} else {
// wholly inside the clip
self.push_quad(&pts, reverse);
}
}
fn push_quad(&mut self, pts: &[Point; 3], reverse: bool) {
if reverse {
self.edges.push(PathEdge::QuadTo(pts[2], pts[1], pts[0]));
} else {
self.edges.push(PathEdge::QuadTo(pts[0], pts[1], pts[2]));
}
}
fn clip_cubic(mut self, p0: Point, p1: Point, p2: Point, p3: Point) -> Option<ClippedEdges> {
let pts = [p0, p1, p2, p3];
let bounds = Rect::from_points(&pts)?;
// check if we're clipped out vertically
if bounds.bottom() > self.clip.top() && bounds.top() < self.clip.bottom() {
if too_big_for_reliable_float_math(&bounds) {
// can't safely clip the cubic, so we give up and draw a line (which we can safely clip)
//
// If we rewrote chopcubicat*extrema and chopmonocubic using doubles, we could very
// likely always handle the cubic safely, but (it seems) at a big loss in speed, so
// we'd only want to take that alternate impl if needed.
return self.clip_line(p0, p3);
} else {
let mut mono_y = [Point::zero(); 10];
let count_y = path_geometry::chop_cubic_at_y_extrema(&pts, &mut mono_y);
for y in 0..=count_y {
let mut mono_x = [Point::zero(); 10];
let y_points: [Point; 4] = copy_4_points!(mono_y, y * 3);
let count_x = path_geometry::chop_cubic_at_x_extrema(&y_points, &mut mono_x);
for x in 0..=count_x {
let x_points: [Point; 4] = copy_4_points!(mono_x, x * 3);
self.clip_mono_cubic(&x_points);
}
}
}
}
if self.edges.is_empty() {
None
} else {
Some(self.edges)
}
}
// src[] must be monotonic in X and Y
fn clip_mono_cubic(&mut self, src: &[Point; 4]) {
let mut pts = [Point::zero(); 4];
let mut reverse = sort_increasing_y(src, &mut pts);
// are we completely above or below
if pts[3].y <= self.clip.top() || pts[0].y >= self.clip.bottom() {
return;
}
// Now chop so that pts is contained within clip in Y
chop_cubic_in_y(&self.clip, &mut pts);
if pts[0].x > pts[3].x {
pts.swap(0, 3);
pts.swap(1, 2);
reverse = !reverse;
}
// Now chop in X as needed, and record the segments
if pts[3].x <= self.clip.left() {
// wholly to the left
self.push_vline(self.clip.left(), pts[0].y, pts[3].y, reverse);
return;
}
if pts[0].x >= self.clip.right() {
// wholly to the right
if !self.can_cull_to_the_right {
self.push_vline(self.clip.right(), pts[0].y, pts[3].y, reverse);
}
return;
}
// are we partially to the left
if pts[0].x < self.clip.left() {
let mut tmp = [Point::zero(); 7];
chop_mono_cubic_at_x(&pts, self.clip.left(), &mut tmp);
self.push_vline(self.clip.left(), tmp[0].y, tmp[3].y, reverse);
// tmp[3, 4].x should all be to the right of clip.left().
// Since we can't trust the numerics of
// the chopper, we force those conditions now
tmp[3].x = self.clip.left();
tmp[4].x = tmp[4].x.max(self.clip.left());
pts[0] = tmp[3];
pts[1] = tmp[4];
pts[2] = tmp[5];
}
// are we partially to the right
if pts[3].x > self.clip.right() {
let mut tmp = [Point::zero(); 7];
chop_mono_cubic_at_x(&pts, self.clip.right(), &mut tmp);
tmp[3].x = self.clip.right();
tmp[2].x = tmp[2].x.min(self.clip.right());
self.push_cubic(&copy_4_points!(tmp, 0), reverse);
self.push_vline(self.clip.right(), tmp[3].y, tmp[6].y, reverse);
} else {
// wholly inside the clip
self.push_cubic(&pts, reverse);
}
}
fn push_cubic(&mut self, pts: &[Point; 4], reverse: bool) {
if reverse {
self.edges
.push(PathEdge::CubicTo(pts[3], pts[2], pts[1], pts[0]));
} else {
self.edges
.push(PathEdge::CubicTo(pts[0], pts[1], pts[2], pts[3]));
}
}
}
pub struct EdgeClipperIter<'a> {
edge_iter: PathEdgeIter<'a>,
clip: Rect,
can_cull_to_the_right: bool,
}
impl<'a> EdgeClipperIter<'a> {
pub fn new(path: &'a Path, clip: Rect, can_cull_to_the_right: bool) -> Self {
EdgeClipperIter {
edge_iter: edge_iter(path),
clip,
can_cull_to_the_right,
}
}
}
impl Iterator for EdgeClipperIter<'_> {
type Item = ClippedEdges;
fn next(&mut self) -> Option<Self::Item> {
for edge in &mut self.edge_iter {
let clipper = EdgeClipper::new(self.clip, self.can_cull_to_the_right);
match edge {
PathEdge::LineTo(p0, p1) => {
if let Some(edges) = clipper.clip_line(p0, p1) {
return Some(edges);
}
}
PathEdge::QuadTo(p0, p1, p2) => {
if let Some(edges) = clipper.clip_quad(p0, p1, p2) {
return Some(edges);
}
}
PathEdge::CubicTo(p0, p1, p2, p3) => {
if let Some(edges) = clipper.clip_cubic(p0, p1, p2, p3) {
return Some(edges);
}
}
}
}
None
}
}
fn quick_reject(bounds: &Rect, clip: &Rect) -> bool {
bounds.top() >= clip.bottom() || bounds.bottom() <= clip.top()
}
// src[] must be monotonic in Y. This routine copies src into dst, and sorts
// it to be increasing in Y. If it had to reverse the order of the points,
// it returns true, otherwise it returns false
fn sort_increasing_y(src: &[Point], dst: &mut [Point]) -> bool {
// We need the data to be monotonically increasing in Y.
// Never fails, because src is always non-empty.
if src[0].y > src.last().unwrap().y {
for (i, p) in src.iter().rev().enumerate() {
dst[i] = *p;
}
true
} else {
dst[0..src.len()].copy_from_slice(src);
false
}
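The reversal matters beyond ordering: callers flip their `reverse` flag when the points were swapped, so the winding stays correct. A standalone sketch over plain `(x, y)` tuples (the real code works on `Point`):

```rust
/// Copy `src` into `dst`, reversing when the contour runs upward, so that
/// `dst` is monotonically increasing in y. Returns whether the order was
/// reversed (the caller must then flip its winding).
fn sort_increasing_y(src: &[(f32, f32)], dst: &mut [(f32, f32)]) -> bool {
    if src[0].1 > src.last().unwrap().1 {
        for (i, p) in src.iter().rev().enumerate() {
            dst[i] = *p;
        }
        true
    } else {
        dst[..src.len()].copy_from_slice(src);
        false
    }
}
```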
}
/// Modifies pts[] in place so that it is clipped in Y to the clip rect.
fn chop_quad_in_y(clip: &Rect, pts: &mut [Point; 3]) {
let mut t = NormalizedF32Exclusive::ANY;
let mut tmp = [Point::zero(); 5];
// are we partially above
if pts[0].y < clip.top() {
if chop_mono_quad_at_y(pts, clip.top(), &mut t) {
// take the 2nd chopped quad
path_geometry::chop_quad_at(pts, t, &mut tmp);
// clamp to clean up imprecise numerics in the chop
tmp[2].y = clip.top();
tmp[3].y = tmp[3].y.max(clip.top());
pts[0] = tmp[2];
pts[1] = tmp[3];
} else {
// if chop_mono_quad_at_y failed, then we may have hit inexact numerics
// so we just clamp against the top
for p in pts.iter_mut() {
if p.y < clip.top() {
p.y = clip.top();
}
}
}
}
// are we partially below
if pts[2].y > clip.bottom() {
if chop_mono_quad_at_y(pts, clip.bottom(), &mut t) {
path_geometry::chop_quad_at(pts, t, &mut tmp);
// clamp to clean up imprecise numerics in the chop
tmp[1].y = tmp[1].y.min(clip.bottom());
tmp[2].y = clip.bottom();
pts[1] = tmp[1];
pts[2] = tmp[2];
} else {
// if chop_mono_quad_at_y failed, then we may have hit inexact numerics
// so we just clamp against the bottom
for p in pts.iter_mut() {
if p.y > clip.bottom() {
p.y = clip.bottom();
}
}
}
}
}
fn chop_mono_quad_at_x(pts: &[Point; 3], x: f32, t: &mut NormalizedF32Exclusive) -> bool {
chop_mono_quad_at(pts[0].x, pts[1].x, pts[2].x, x, t)
}
fn chop_mono_quad_at_y(pts: &[Point; 3], y: f32, t: &mut NormalizedF32Exclusive) -> bool {
chop_mono_quad_at(pts[0].y, pts[1].y, pts[2].y, y, t)
}
fn chop_mono_quad_at(
c0: f32,
c1: f32,
c2: f32,
target: f32,
t: &mut NormalizedF32Exclusive,
) -> bool {
// Solve F(t) = y where F(t) := [0](1-t)^2 + 2[1]t(1-t) + [2]t^2
// We solve for t, using quadratic equation, hence we have to rearrange
// our coefficients to look like At^2 + Bt + C
let a = c0 - c1 - c1 + c2;
let b = 2.0 * (c1 - c0);
let c = c0 - target;
let mut roots = path_geometry::new_t_values();
let count = path_geometry::find_unit_quad_roots(a, b, c, &mut roots);
if count != 0 {
*t = roots[0];
true
} else {
false
}
}
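In floating point, the same rearrangement solves c0(1-t)² + 2·c1·t(1-t) + c2·t² = target directly with the quadratic formula. A sketch under the assumption that the quad is non-degenerate (A ≠ 0); the real code defers to `path_geometry::find_unit_quad_roots`, which also handles the degenerate and no-root cases:

```rust
/// Find t in (0, 1) with c0*(1-t)^2 + 2*c1*t*(1-t) + c2*t^2 == target.
/// Assumes the quadratic coefficient A = c0 - 2*c1 + c2 is nonzero.
fn chop_mono_quad_at(c0: f32, c1: f32, c2: f32, target: f32) -> Option<f32> {
    let a = c0 - 2.0 * c1 + c2;
    let b = 2.0 * (c1 - c0);
    let c = c0 - target;
    let disc = b * b - 4.0 * a * c;
    if disc < 0.0 {
        return None;
    }
    let sqrt_d = disc.sqrt();
    // Return whichever root of A t^2 + B t + C = 0 lies inside (0, 1).
    [(-b + sqrt_d) / (2.0 * a), (-b - sqrt_d) / (2.0 * a)]
        .into_iter()
        .find(|&t| t > 0.0 && t < 1.0)
}
```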
fn too_big_for_reliable_float_math(r: &Rect) -> bool {
// limit set as the largest float value for which we can still reliably compute things like
// - chopping at XY extrema
// - chopping at Y or X values for clipping
//
// Current value chosen just by experiment. Larger (and still succeeds) is always better.
let limit = (1 << 22) as f32;
r.left() < -limit || r.top() < -limit || r.right() > limit || r.bottom() > limit
}
/// Modifies pts[] in place so that it is clipped in Y to the clip rect.
fn chop_cubic_in_y(clip: &Rect, pts: &mut [Point; 4]) {
// are we partially above
if pts[0].y < clip.top() {
let mut tmp = [Point::zero(); 7];
chop_mono_cubic_at_y(pts, clip.top(), &mut tmp);
// For a large range in the points, we can do a poor job of chopping, such that the t
// we computed resulted in the lower cubic still being partly above the clip.
//
// If just the first or first 2 Y values are above clip.top(), we can just smash them
// down. If the first 3 Ys are above it, we can't smash all 3, as that can really
// distort the cubic. In this case, we take the first output (tmp[3..6]) and treat it as
// a guess, and re-chop against clip.top(). Then we fall through to checking if we need
// to smash the first 1 or 2 Y values.
if tmp[3].y < clip.top() && tmp[4].y < clip.top() && tmp[5].y < clip.top() {
let tmp2: [Point; 4] = copy_4_points!(tmp, 3);
chop_mono_cubic_at_y(&tmp2, clip.top(), &mut tmp);
}
// tmp[3, 4].y should now all be at or below clip.top().
// Since we can't trust the numerics of the chopper, we force those conditions now
tmp[3].y = clip.top();
tmp[4].y = tmp[4].y.max(clip.top());
pts[0] = tmp[3];
pts[1] = tmp[4];
pts[2] = tmp[5];
}
// are we partially below
if pts[3].y > clip.bottom() {
let mut tmp = [Point::zero(); 7];
chop_mono_cubic_at_y(pts, clip.bottom(), &mut tmp);
tmp[3].y = clip.bottom();
tmp[2].y = tmp[2].y.min(clip.bottom());
pts[1] = tmp[1];
pts[2] = tmp[2];
pts[3] = tmp[3];
}
}
fn chop_mono_cubic_at_x(src: &[Point; 4], x: f32, dst: &mut [Point; 7]) {
if path_geometry::chop_mono_cubic_at_x(src, x, dst) {
return;
}
let src_values = [src[0].x, src[1].x, src[2].x, src[3].x];
path_geometry::chop_cubic_at2(src, mono_cubic_closest_t(&src_values, x), dst);
}
fn chop_mono_cubic_at_y(src: &[Point; 4], y: f32, dst: &mut [Point; 7]) {
if path_geometry::chop_mono_cubic_at_y(src, y, dst) {
return;
}
let src_values = [src[0].y, src[1].y, src[2].y, src[3].y];
path_geometry::chop_cubic_at2(src, mono_cubic_closest_t(&src_values, y), dst);
}
fn mono_cubic_closest_t(src: &[f32; 4], mut x: f32) -> NormalizedF32Exclusive {
let mut t = 0.5;
let mut last_t;
let mut best_t = t;
let mut step = 0.25;
let d = src[0];
let a = src[3] + 3.0 * (src[1] - src[2]) - d;
let b = 3.0 * (src[2] - src[1] - src[1] + d);
let c = 3.0 * (src[1] - d);
x -= d;
let mut closest = SCALAR_MAX;
loop {
let loc = ((a * t + b) * t + c) * t;
let dist = (loc - x).abs();
if closest > dist {
closest = dist;
best_t = t;
}
last_t = t;
t += if loc < x { step } else { -step };
step *= 0.5;
if !(closest > 0.25 && last_t != t) {
break;
}
}
NormalizedF32Exclusive::new(best_t).unwrap()
}
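The loop above is a guided bisection: starting at t = 0.5 with step 0.25, it steps toward the target value, halves the step each iteration, and keeps the best t seen. A float sketch of the same search on a cubic monotonic over [0, 1] (fixed iteration count instead of the 0.25-distance cutoff):

```rust
/// Bisection-style search for t with B(t) ~= target, where B is the cubic
/// with control values `src`, assumed monotonic over [0, 1].
fn closest_t(src: [f32; 4], target: f32) -> f32 {
    // Power-basis coefficients of B(t) - B(0), as in the fixed-point code.
    let d = src[0];
    let a = src[3] + 3.0 * (src[1] - src[2]) - d;
    let b = 3.0 * (src[2] - 2.0 * src[1] + d);
    let c = 3.0 * (src[1] - d);
    let x = target - d;
    let (mut t, mut step) = (0.5_f32, 0.25_f32);
    for _ in 0..24 {
        let loc = ((a * t + b) * t + c) * t; // Horner evaluation
        t += if loc < x { step } else { -step };
        step *= 0.5;
    }
    t
}
```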


@ -0,0 +1,130 @@
// Copyright 2006 The Android Open Source Project
// Copyright 2020 Yevhenii Reizner
//
// Use of this source code is governed by a BSD-style license that can be
// found in the LICENSE file.
// Skia uses fixed points pretty chaotically, therefore we cannot use
// strongly typed wrappers. Which is unfortunate.
use tiny_skia_path::SaturateCast;
use crate::math::{bound, left_shift, left_shift64};
/// A 26.6 fixed point.
pub type FDot6 = i32;
/// A 24.8 fixed point.
pub type FDot8 = i32;
/// A 16.16 fixed point.
pub type FDot16 = i32;
pub mod fdot6 {
use super::*;
use core::convert::TryFrom;
pub const ONE: FDot6 = 64;
pub fn from_i32(n: i32) -> FDot6 {
debug_assert!(n as i16 as i32 == n);
n << 6
}
pub fn from_f32(n: f32) -> FDot6 {
(n * 64.0) as i32
}
pub fn floor(n: FDot6) -> FDot6 {
n >> 6
}
pub fn ceil(n: FDot6) -> FDot6 {
(n + 63) >> 6
}
pub fn round(n: FDot6) -> FDot6 {
(n + 32) >> 6
}
pub fn to_fdot16(n: FDot6) -> FDot16 {
debug_assert!((left_shift(n, 10) >> 10) == n);
left_shift(n, 10)
}
pub fn div(a: FDot6, b: FDot6) -> FDot16 {
debug_assert_ne!(b, 0);
if i16::try_from(a).is_ok() {
left_shift(a, 16) / b
} else {
fdot16::div(a, b)
}
}
pub fn can_convert_to_fdot16(n: FDot6) -> bool {
let max_dot6 = core::i32::MAX >> (16 - 6);
n.abs() <= max_dot6
}
pub fn small_scale(value: u8, dot6: FDot6) -> u8 {
debug_assert!(dot6 as u32 <= 64);
((value as i32 * dot6) >> 6) as u8
}
}
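Concretely, the 26.6 encoding stores a coordinate x as x·64, so one pixel is 64 units and the low 6 bits hold the subpixel fraction; `ceil` and `round` work by biasing before the truncating shift. A standalone sketch of the conversions above (plain `i32`, no typedefs):

```rust
/// 26.6 fixed point: x is stored as x * 64.
fn from_f32(n: f32) -> i32 {
    (n * 64.0) as i32
}
fn floor6(n: i32) -> i32 {
    n >> 6
}
fn ceil6(n: i32) -> i32 {
    (n + 63) >> 6 // bias by just under 1.0 before truncating
}
fn round6(n: i32) -> i32 {
    (n + 32) >> 6 // bias by 0.5 before truncating
}
```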
pub mod fdot8 {
use super::*;
// Extracted from SkScan_Antihair.cpp
pub fn from_fdot16(x: FDot16) -> FDot8 {
(x + 0x80) >> 8
}
}
pub mod fdot16 {
use super::*;
pub const HALF: FDot16 = (1 << 16) / 2;
pub const ONE: FDot16 = 1 << 16;
// Note: `from_f32` intentionally has no rounding step. For all fixed-point
// values, this is as accurate as possible for a (fixed -> float -> fixed)
// round-trip: rounding would lose accuracy when the intermediate float lies
// in the range that holds only integers (adding 0.5 to an odd integer snaps
// to the nearest even). Doing the rounding math in doubles would maximize
// (float -> fixed -> float) accuracy, but that's usually overkill.
pub fn from_f32(x: f32) -> FDot16 {
i32::saturate_from(x * ONE as f32)
}
pub fn floor_to_i32(x: FDot16) -> i32 {
x >> 16
}
pub fn ceil_to_i32(x: FDot16) -> i32 {
(x + ONE - 1) >> 16
}
pub fn round_to_i32(x: FDot16) -> i32 {
(x + HALF) >> 16
}
// The product may exceed 32 bits, so compute it in 64 bits before shifting.
pub fn mul(a: FDot16, b: FDot16) -> FDot16 {
((i64::from(a) * i64::from(b)) >> 16) as FDot16
}
// The divide may exceed 32 bits. Clamp to a signed 32 bit result.
pub fn div(numer: FDot6, denom: FDot6) -> FDot16 {
let v = left_shift64(numer as i64, 16) / denom as i64;
let n = bound(i32::MIN as i64, v, i32::MAX as i64);
n as i32
}
pub fn fast_div(a: FDot6, b: FDot6) -> FDot16 {
debug_assert!((left_shift(a, 16) >> 16) == a);
debug_assert!(b != 0);
left_shift(a, 16) / b
}
}
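`mul` widens to i64 so the 32x32-bit product cannot overflow before the shift drops the extra 16 fraction bits. A standalone sketch:

```rust
const ONE: i32 = 1 << 16; // 1.0 in 16.16 fixed point

/// 16.16 multiply: widen to i64, multiply, then drop 16 fraction bits.
fn fdot16_mul(a: i32, b: i32) -> i32 {
    ((i64::from(a) * i64::from(b)) >> 16) as i32
}
```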

third-party/vendor/tiny-skia/src/lib.rs vendored Normal file

@ -0,0 +1,72 @@
/*!
`tiny-skia` is a tiny [Skia](https://skia.org/) subset ported to Rust.
`tiny-skia` API is a bit unconventional.
It doesn't look like cairo, QPainter (Qt), HTML Canvas or even Skia itself.
Instead, `tiny-skia` provides a set of low-level drawing APIs
and a user should manage the world transform, clipping mask and style manually.
See the `examples/` directory for usage examples.
*/
#![no_std]
#![warn(missing_docs)]
#![warn(missing_copy_implementations)]
#![warn(missing_debug_implementations)]
#![allow(clippy::approx_constant)]
#![allow(clippy::clone_on_copy)]
#![allow(clippy::collapsible_else_if)]
#![allow(clippy::collapsible_if)]
#![allow(clippy::comparison_chain)]
#![allow(clippy::enum_variant_names)]
#![allow(clippy::excessive_precision)]
#![allow(clippy::identity_op)]
#![allow(clippy::manual_range_contains)]
#![allow(clippy::needless_range_loop)]
#![allow(clippy::too_many_arguments)]
#![allow(clippy::wrong_self_convention)]
#[cfg(not(any(feature = "std", feature = "no-std-float")))]
compile_error!("You have to activate either the `std` or the `no-std-float` feature.");
#[cfg(feature = "std")]
extern crate std;
extern crate alloc;
mod alpha_runs;
mod blend_mode;
mod blitter;
mod clip;
mod color;
mod edge;
mod edge_builder;
mod edge_clipper;
mod fixed_point;
mod line_clipper;
mod math;
mod path64;
mod path_geometry;
mod pipeline;
mod pixmap;
mod scan;
mod shaders;
mod wide;
mod painter; // Keep it under `pixmap` for a better order in the docs.
pub use blend_mode::BlendMode;
pub use clip::ClipMask;
pub use color::{Color, ColorU8, PremultipliedColor, PremultipliedColorU8};
pub use color::{ALPHA_OPAQUE, ALPHA_TRANSPARENT, ALPHA_U8_OPAQUE, ALPHA_U8_TRANSPARENT};
pub use painter::{FillRule, Paint};
pub use pixmap::{Pixmap, PixmapMut, PixmapRef, BYTES_PER_PIXEL};
pub use shaders::{FilterQuality, GradientStop, PixmapPaint, SpreadMode};
pub use shaders::{LinearGradient, Pattern, RadialGradient, Shader};
pub use tiny_skia_path::{IntRect, Point, Rect, Transform};
pub use tiny_skia_path::{LineCap, LineJoin, Stroke, StrokeDash};
pub use tiny_skia_path::{Path, PathBuilder, PathSegment, PathSegmentsIter};
/// An integer length that is guaranteed to be > 0.
type LengthU32 = core::num::NonZeroU32;

// Copyright 2011 Google Inc.
// Copyright 2020 Yevhenii Reizner
//
// Use of this source code is governed by a BSD-style license that can be
// found in the LICENSE file.
use crate::{Point, Rect};
use tiny_skia_path::Scalar;
pub const MAX_POINTS: usize = 4;
/// Clip the line pts[0]...pts[1] against clip, ignoring segments that
/// lie completely above or below the clip. For portions to the left or
/// right, turn those into vertical line segments that are aligned to the
/// edge of the clip.
///
/// Return the number of line segments that result, and store the end-points
/// of those segments sequentially in lines as follows:
///
/// 1st segment: lines[0]..lines[1]
/// 2nd segment: lines[1]..lines[2]
/// 3rd segment: lines[2]..lines[3]
pub fn clip<'a>(
src: &[Point; 2],
clip: &Rect,
can_cull_to_the_right: bool,
points: &'a mut [Point; MAX_POINTS],
) -> &'a [Point] {
let (mut index0, mut index1) = if src[0].y < src[1].y { (0, 1) } else { (1, 0) };
// Check if we're completely clipped out in Y (above or below)
if src[index1].y <= clip.top() {
// we're above the clip
return &[];
}
if src[index0].y >= clip.bottom() {
// we're below the clip
return &[];
}
// Chop in Y to produce a single segment, stored in tmp[0..1]
let mut tmp = *src;
// now compute intersections
if src[index0].y < clip.top() {
tmp[index0] = Point::from_xy(sect_with_horizontal(src, clip.top()), clip.top());
debug_assert!(is_between_unsorted(tmp[index0].x, src[0].x, src[1].x));
}
if tmp[index1].y > clip.bottom() {
tmp[index1] = Point::from_xy(sect_with_horizontal(src, clip.bottom()), clip.bottom());
debug_assert!(is_between_unsorted(tmp[index1].x, src[0].x, src[1].x));
}
// Chop it into 1..3 segments that are wholly within the clip in X.
// temp storage for up to 3 segments
let mut result_storage = [Point::zero(); MAX_POINTS];
let mut line_count = 1;
let mut reverse;
if src[0].x < src[1].x {
index0 = 0;
index1 = 1;
reverse = false;
} else {
index0 = 1;
index1 = 0;
reverse = true;
}
let result: &[Point] = if tmp[index1].x <= clip.left() {
// wholly to the left
tmp[0].x = clip.left();
tmp[1].x = clip.left();
reverse = false;
&tmp
} else if tmp[index0].x >= clip.right() {
// wholly to the right
if can_cull_to_the_right {
return &[];
}
tmp[0].x = clip.right();
tmp[1].x = clip.right();
reverse = false;
&tmp
} else {
let mut offset = 0;
if tmp[index0].x < clip.left() {
result_storage[offset] = Point::from_xy(clip.left(), tmp[index0].y);
offset += 1;
result_storage[offset] =
Point::from_xy(clip.left(), sect_clamp_with_vertical(&tmp, clip.left()));
debug_assert!(is_between_unsorted(
result_storage[offset].y,
tmp[0].y,
tmp[1].y
));
} else {
result_storage[offset] = tmp[index0];
}
offset += 1;
if tmp[index1].x > clip.right() {
result_storage[offset] =
Point::from_xy(clip.right(), sect_clamp_with_vertical(&tmp, clip.right()));
debug_assert!(is_between_unsorted(
result_storage[offset].y,
tmp[0].y,
tmp[1].y
));
offset += 1;
result_storage[offset] = Point::from_xy(clip.right(), tmp[index1].y);
} else {
result_storage[offset] = tmp[index1];
}
line_count = offset;
&result_storage
};
// Now copy the results into the caller's lines[] parameter
if reverse {
// copy the pts in reverse order to maintain winding order
for i in 0..=line_count {
points[line_count - i] = result[i];
}
} else {
let len = line_count + 1;
points[0..len].copy_from_slice(&result[0..len]);
}
&points[0..line_count + 1]
}
/// Returns X coordinate of intersection with horizontal line at Y.
fn sect_with_horizontal(src: &[Point; 2], y: f32) -> f32 {
let dy = src[1].y - src[0].y;
if dy.is_nearly_zero() {
src[0].x.ave(src[1].x)
} else {
// need the extra precision so we don't compute a value that exceeds
// our original limits
let x0 = f64::from(src[0].x);
let y0 = f64::from(src[0].y);
let x1 = f64::from(src[1].x);
let y1 = f64::from(src[1].y);
let result = x0 + (f64::from(y) - y0) * (x1 - x0) / (y1 - y0);
        // The computed X value might still exceed [X0..X1] due to floating-point
        // rounding when the doubles were added and subtracted, so we have to pin the
        // answer.
pin_unsorted_f64(result, x0, x1) as f32
}
}
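The horizontal-intersection logic above can be sketched as a standalone function (a simplified reimplementation with tuples instead of `Point`, and a plain epsilon in place of `is_nearly_zero`):

```rust
// Minimal sketch of intersecting a segment with the horizontal line `y`,
// mirroring `sect_with_horizontal`: compute in f64 for precision, then pin
// the result back into the unsorted [x0, x1] range to guard rounding drift.
fn sect_with_horizontal(p0: (f32, f32), p1: (f32, f32), y: f32) -> f32 {
    let dy = p1.1 - p0.1;
    if dy.abs() < 1e-12 {
        // Nearly horizontal: return the midpoint of the X range.
        return (p0.0 + p1.0) / 2.0;
    }
    let (x0, y0) = (p0.0 as f64, p0.1 as f64);
    let (x1, y1) = (p1.0 as f64, p1.1 as f64);
    let result = x0 + (y as f64 - y0) * (x1 - x0) / (y1 - y0);
    // Pin to the (unsorted) segment limits.
    let (lo, hi) = if x0 <= x1 { (x0, x1) } else { (x1, x0) };
    result.clamp(lo, hi) as f32
}

fn main() {
    // The segment (0,0)-(10,10) crosses y = 4 at x = 4.
    assert_eq!(sect_with_horizontal((0.0, 0.0), (10.0, 10.0), 4.0), 4.0);
    // The result is always pinned inside the segment's X range.
    let x = sect_with_horizontal((1.0, 0.0), (3.0, 7.0), 3.5);
    assert!(x >= 1.0 && x <= 3.0);
}
```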
/// Returns true if `value` lies between the two limits, which may be given in either order.
fn is_between_unsorted(value: f32, limit0: f32, limit1: f32) -> bool {
if limit0 < limit1 {
limit0 <= value && value <= limit1
} else {
limit1 <= value && value <= limit0
}
}
fn sect_clamp_with_vertical(src: &[Point; 2], x: f32) -> f32 {
let y = sect_with_vertical(src, x);
// Our caller expects y to be between src[0].y and src[1].y (unsorted), but due to the
// numerics of floats/doubles, we might have computed a value slightly outside of that,
// so we have to manually clamp afterwards.
// See skbug.com/7491
pin_unsorted_f32(y, src[0].y, src[1].y)
}
/// Returns Y coordinate of intersection with vertical line at X.
fn sect_with_vertical(src: &[Point; 2], x: f32) -> f32 {
let dx = src[1].x - src[0].x;
if dx.is_nearly_zero() {
src[0].y.ave(src[1].y)
} else {
// need the extra precision so we don't compute a value that exceeds
// our original limits
let x0 = f64::from(src[0].x);
let y0 = f64::from(src[0].y);
let x1 = f64::from(src[1].x);
let y1 = f64::from(src[1].y);
let result = y0 + (f64::from(x) - x0) * (y1 - y0) / (x1 - x0);
result as f32
}
}
fn pin_unsorted_f32(value: f32, mut limit0: f32, mut limit1: f32) -> f32 {
if limit1 < limit0 {
core::mem::swap(&mut limit0, &mut limit1);
}
// now the limits are sorted
debug_assert!(limit0 <= limit1);
if value < limit0 {
limit0
} else if value > limit1 {
limit1
} else {
value
}
}
fn pin_unsorted_f64(value: f64, mut limit0: f64, mut limit1: f64) -> f64 {
if limit1 < limit0 {
core::mem::swap(&mut limit0, &mut limit1);
}
// now the limits are sorted
debug_assert!(limit0 <= limit1);
if value < limit0 {
limit0
} else if value > limit1 {
limit1
} else {
value
}
}
/// Intersect the line segment against the rect. If there is a non-empty
/// resulting segment, return true and set dst[] to that segment. If not,
/// return false and ignore dst[].
///
/// `clip` is specialized for scan-conversion, as it adds vertical
/// segments on the sides to show where the line extended beyond the
/// left or right sides. `intersect` does not.
pub fn intersect(src: &[Point; 2], clip: &Rect, dst: &mut [Point; 2]) -> bool {
let bounds = Rect::from_ltrb(
src[0].x.min(src[1].x),
src[0].y.min(src[1].y),
src[0].x.max(src[1].x),
src[0].y.max(src[1].y),
);
if let Some(bounds) = bounds {
if contains_no_empty_check(clip, &bounds) {
dst.copy_from_slice(src);
return true;
}
// check for no overlap, and only permit coincident edges if the line
        // and the edge are collinear
if nested_lt(bounds.right(), clip.left(), bounds.width())
|| nested_lt(clip.right(), bounds.left(), bounds.width())
|| nested_lt(bounds.bottom(), clip.top(), bounds.height())
|| nested_lt(clip.bottom(), bounds.top(), bounds.height())
{
return false;
}
}
let (index0, index1) = if src[0].y < src[1].y { (0, 1) } else { (1, 0) };
let mut tmp = src.clone();
// now compute Y intersections
if tmp[index0].y < clip.top() {
tmp[index0] = Point::from_xy(sect_with_horizontal(src, clip.top()), clip.top());
}
if tmp[index1].y > clip.bottom() {
tmp[index1] = Point::from_xy(sect_with_horizontal(src, clip.bottom()), clip.bottom());
}
let (index0, index1) = if tmp[0].x < tmp[1].x { (0, 1) } else { (1, 0) };
// check for quick-reject in X again, now that we may have been chopped
if tmp[index1].x <= clip.left() || tmp[index0].x >= clip.right() {
// usually we will return false, but we don't if the line is vertical and coincident
// with the clip.
if tmp[0].x != tmp[1].x || tmp[0].x < clip.left() || tmp[0].x > clip.right() {
return false;
}
}
if tmp[index0].x < clip.left() {
tmp[index0] = Point::from_xy(clip.left(), sect_with_vertical(src, clip.left()));
}
if tmp[index1].x > clip.right() {
tmp[index1] = Point::from_xy(clip.right(), sect_with_vertical(src, clip.right()));
}
dst.copy_from_slice(&tmp);
true
}
fn nested_lt(a: f32, b: f32, dim: f32) -> bool {
a <= b && (a < b || dim > 0.0)
}
// returns true if outer contains inner, even if inner is empty.
fn contains_no_empty_check(outer: &Rect, inner: &Rect) -> bool {
outer.left() <= inner.left()
&& outer.top() <= inner.top()
&& outer.right() >= inner.right()
&& outer.bottom() >= inner.bottom()
}

// Copyright 2006 The Android Open Source Project
// Copyright 2020 Yevhenii Reizner
//
// Use of this source code is governed by a BSD-style license that can be
// found in the LICENSE file.
use crate::LengthU32;
// SAFETY: 1 is non-zero.
pub const LENGTH_U32_ONE: LengthU32 = unsafe { LengthU32::new_unchecked(1) };
pub fn left_shift(value: i32, shift: i32) -> i32 {
((value as u32) << shift) as i32
}
pub fn left_shift64(value: i64, shift: i32) -> i64 {
((value as u64) << shift) as i64
}
pub fn bound<T: Ord + Copy>(min: T, value: T, max: T) -> T {
max.min(value).max(min)
}
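The two helpers above can be demonstrated with a small self-contained sketch (the cast through the unsigned type is a habit inherited from C, where left-shifting a negative signed value is undefined behavior; in Rust the shift itself is already defined, so this is a faithful port of the Skia code):

```rust
// Sketch of the helpers above: shifting through the unsigned type, and
// clamping a value into [min, max] with `bound`.
fn left_shift(value: i32, shift: i32) -> i32 {
    ((value as u32) << shift) as i32
}

fn bound<T: Ord + Copy>(min: T, value: T, max: T) -> T {
    max.min(value).max(min)
}

fn main() {
    assert_eq!(left_shift(-3, 4), -48);
    // Bits shifted past the top of the word are simply discarded:
    assert_eq!(left_shift(1 << 30, 2), 0);
    assert_eq!(bound(0, 5, 10), 5);
    assert_eq!(bound(0, -5, 10), 0);
    assert_eq!(bound(0, 50, 10), 10);
}
```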

// Copyright 2006 The Android Open Source Project
// Copyright 2020 Yevhenii Reizner
//
// Use of this source code is governed by a BSD-style license that can be
// found in the LICENSE file.
use crate::*;
use tiny_skia_path::{PathStroker, Scalar, ScreenIntRect, SCALAR_MAX};
use crate::clip::SubClipMaskRef;
use crate::pipeline::RasterPipelineBlitter;
use crate::pixmap::SubPixmapMut;
use crate::scan;
#[cfg(all(not(feature = "std"), feature = "no-std-float"))]
use tiny_skia_path::NoStdFloat;
/// A path filling rule.
#[derive(Copy, Clone, PartialEq, Debug)]
pub enum FillRule {
/// Specifies that "inside" is computed by a non-zero sum of signed edge crossings.
Winding,
/// Specifies that "inside" is computed by an odd number of edge crossings.
EvenOdd,
}
impl Default for FillRule {
fn default() -> Self {
FillRule::Winding
}
}
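The two fill rules described above can be illustrated with a standalone scanline crossing test on two overlapping contours (a sketch for intuition only, not the crate's scan-conversion code):

```rust
#[derive(Clone, Copy, PartialEq, Debug)]
enum FillRule { Winding, EvenOdd }

// Signed crossing count of a rightward ray from (px, py) over all contours.
fn winding_number(contours: &[&[(f64, f64)]], px: f64, py: f64) -> i32 {
    let mut winding = 0;
    for contour in contours {
        let n = contour.len();
        for i in 0..n {
            let (x0, y0) = contour[i];
            let (x1, y1) = contour[(i + 1) % n];
            // Does this edge cross the horizontal line through the point?
            if (y0 <= py) != (y1 <= py) {
                let x_at = x0 + (py - y0) * (x1 - x0) / (y1 - y0);
                if x_at > px {
                    // +1 for edges going one way, -1 for the other.
                    winding += if y1 > y0 { 1 } else { -1 };
                }
            }
        }
    }
    winding
}

fn is_inside(winding: i32, rule: FillRule) -> bool {
    match rule {
        FillRule::Winding => winding != 0,     // non-zero sum of signed crossings
        FillRule::EvenOdd => winding % 2 != 0, // odd number of crossings
    }
}

fn main() {
    // Two overlapping squares wound in the same direction:
    // the overlap region has winding number 2, so the rules disagree there.
    let a: &[(f64, f64)] = &[(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)];
    let b: &[(f64, f64)] = &[(2.0, 2.0), (6.0, 2.0), (6.0, 6.0), (2.0, 6.0)];
    let overlap = winding_number(&[a, b], 3.0, 3.0);
    assert_eq!(overlap, 2);
    assert!(is_inside(overlap, FillRule::Winding));  // filled under Winding
    assert!(!is_inside(overlap, FillRule::EvenOdd)); // a hole under EvenOdd
}
```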
/// Controls how a shape should be painted.
#[derive(Clone, PartialEq, Debug)]
pub struct Paint<'a> {
/// A paint shader.
///
/// Default: black color
pub shader: Shader<'a>,
/// Paint blending mode.
///
/// Default: SourceOver
pub blend_mode: BlendMode,
/// Enables anti-aliased painting.
///
/// Default: false
pub anti_alias: bool,
/// Forces the high quality/precision rendering pipeline.
///
    /// `tiny-skia`, just like Skia, has two rendering pipelines:
    /// one uses `f32` and the other uses `u16`. The `u16` one is usually much faster,
    /// but less precise, which can lead to slight differences in output.
///
/// By default, `tiny-skia` will choose the pipeline automatically,
/// depending on a blending mode and other parameters.
/// But you can force the high quality one using this flag.
///
/// This feature is especially useful during testing.
///
    /// Unlike the high quality pipeline, the low quality one doesn't support all
    /// rendering stages, therefore it cannot be forced the same way.
///
/// Default: false
pub force_hq_pipeline: bool,
}
impl Default for Paint<'_> {
fn default() -> Self {
Paint {
shader: Shader::SolidColor(Color::BLACK),
blend_mode: BlendMode::default(),
anti_alias: false,
force_hq_pipeline: false,
}
}
}
impl<'a> Paint<'a> {
/// Sets a paint source to a solid color.
pub fn set_color(&mut self, color: Color) {
self.shader = Shader::SolidColor(color);
}
/// Sets a paint source to a solid color.
///
/// `self.shader = Shader::SolidColor(Color::from_rgba8(50, 127, 150, 200));` shorthand.
pub fn set_color_rgba8(&mut self, r: u8, g: u8, b: u8, a: u8) {
self.set_color(Color::from_rgba8(r, g, b, a))
}
/// Checks that the paint source is a solid color.
pub fn is_solid_color(&self) -> bool {
matches!(self.shader, Shader::SolidColor(_))
}
}
impl Pixmap {
/// Draws a filled rectangle onto the pixmap.
///
/// See [`PixmapMut::fill_rect`](struct.PixmapMut.html#method.fill_rect) for details.
pub fn fill_rect(
&mut self,
rect: Rect,
paint: &Paint,
transform: Transform,
clip_mask: Option<&ClipMask>,
) -> Option<()> {
self.as_mut().fill_rect(rect, paint, transform, clip_mask)
}
/// Draws a filled path onto the pixmap.
///
/// See [`PixmapMut::fill_path`](struct.PixmapMut.html#method.fill_path) for details.
pub fn fill_path(
&mut self,
path: &Path,
paint: &Paint,
fill_rule: FillRule,
transform: Transform,
clip_mask: Option<&ClipMask>,
) -> Option<()> {
self.as_mut()
.fill_path(path, paint, fill_rule, transform, clip_mask)
}
/// Strokes a path.
///
/// See [`PixmapMut::stroke_path`](struct.PixmapMut.html#method.stroke_path) for details.
pub fn stroke_path(
&mut self,
path: &Path,
paint: &Paint,
stroke: &Stroke,
transform: Transform,
clip_mask: Option<&ClipMask>,
) -> Option<()> {
self.as_mut()
.stroke_path(path, paint, stroke, transform, clip_mask)
}
/// Draws a `Pixmap` on top of the current `Pixmap`.
///
/// See [`PixmapMut::draw_pixmap`](struct.PixmapMut.html#method.draw_pixmap) for details.
pub fn draw_pixmap(
&mut self,
x: i32,
y: i32,
pixmap: PixmapRef,
paint: &PixmapPaint,
transform: Transform,
clip_mask: Option<&ClipMask>,
) -> Option<()> {
self.as_mut()
.draw_pixmap(x, y, pixmap, paint, transform, clip_mask)
}
}
impl PixmapMut<'_> {
/// Draws a filled rectangle onto the pixmap.
///
    /// This function is usually slower than filling a rectangular path,
    /// but it produces better results: most notably, it doesn't suffer from
    /// clipping artifacts along horizontal and vertical edges.
///
/// Used mainly to render a pixmap onto a pixmap.
///
/// Returns `None` when there is nothing to fill or in case of a numeric overflow.
pub fn fill_rect(
&mut self,
rect: Rect,
paint: &Paint,
transform: Transform,
clip_mask: Option<&ClipMask>,
) -> Option<()> {
// TODO: we probably can use tiler for rect too
if transform.is_identity() && !DrawTiler::required(self.width(), self.height()) {
// TODO: ignore rects outside the pixmap
let clip = self.size().to_screen_int_rect(0, 0);
let clip_mask = clip_mask.map(|mask| mask.as_submask());
let mut subpix = self.as_subpixmap();
let mut blitter = RasterPipelineBlitter::new(paint, clip_mask, &mut subpix)?;
if paint.anti_alias {
scan::fill_rect_aa(&rect, &clip, &mut blitter)
} else {
scan::fill_rect(&rect, &clip, &mut blitter)
}
} else {
let path = PathBuilder::from_rect(rect);
self.fill_path(&path, paint, FillRule::Winding, transform, clip_mask)
}
}
/// Draws a filled path onto the pixmap.
///
/// Returns `None` when there is nothing to fill or in case of a numeric overflow.
pub fn fill_path(
&mut self,
path: &Path,
paint: &Paint,
fill_rule: FillRule,
transform: Transform,
clip_mask: Option<&ClipMask>,
) -> Option<()> {
if transform.is_identity() {
// This is sort of similar to SkDraw::drawPath
// Skip empty paths and horizontal/vertical lines.
let path_bounds = path.bounds();
if path_bounds.width().is_nearly_zero() || path_bounds.height().is_nearly_zero() {
return None;
}
if is_too_big_for_math(path) {
return None;
}
// TODO: ignore paths outside the pixmap
if let Some(tiler) = DrawTiler::new(self.width(), self.height()) {
let mut path = path.clone(); // TODO: avoid cloning
let mut paint = paint.clone();
for tile in tiler {
let ts = Transform::from_translate(-(tile.x() as f32), -(tile.y() as f32));
path = path.transform(ts)?;
paint.shader.transform(ts);
let clip_rect = tile.size().to_screen_int_rect(0, 0);
let mut subpix = self.subpixmap(tile.to_int_rect())?;
let submask = clip_mask.and_then(|mask| mask.submask(tile.to_int_rect()));
let mut blitter = RasterPipelineBlitter::new(&paint, submask, &mut subpix)?;
                    // We're ignoring "errors" here because `fill_path` will return `None`
                    // when rendering a tile that doesn't have a path on it,
                    // which is not an error in this case.
if paint.anti_alias {
scan::path_aa::fill_path(&path, fill_rule, &clip_rect, &mut blitter);
} else {
scan::path::fill_path(&path, fill_rule, &clip_rect, &mut blitter);
}
let ts = Transform::from_translate(tile.x() as f32, tile.y() as f32);
path = path.transform(ts)?;
paint.shader.transform(ts);
}
Some(())
} else {
let clip_rect = self.size().to_screen_int_rect(0, 0);
let submask = clip_mask.map(|mask| mask.as_submask());
let mut subpix = self.as_subpixmap();
let mut blitter = RasterPipelineBlitter::new(paint, submask, &mut subpix)?;
if paint.anti_alias {
scan::path_aa::fill_path(path, fill_rule, &clip_rect, &mut blitter)
} else {
scan::path::fill_path(path, fill_rule, &clip_rect, &mut blitter)
}
}
} else {
let path = path.clone().transform(transform)?;
let mut paint = paint.clone();
paint.shader.transform(transform);
self.fill_path(&path, &paint, fill_rule, Transform::identity(), clip_mask)
}
}
/// Strokes a path.
///
/// Stroking is implemented using two separate algorithms:
///
    /// 1. If a stroke width is wider than 1px (after applying the transformation),
    ///    the path will be converted into a stroked path and then filled using `fill_path`,
    ///    which means we have to allocate a separate `Path` that can be 2-3x larger
    ///    than the original path.
    /// 2. If a stroke width is thinner than 1px (after applying the transformation),
    ///    we will use hairline stroking, which doesn't involve a separate path allocation.
    ///
    /// Also, if a `stroke` has a dash array, then the path will be converted into
    /// a dashed path first and then stroked, which means yet another allocation.
pub fn stroke_path(
&mut self,
path: &Path,
paint: &Paint,
stroke: &Stroke,
transform: Transform,
clip_mask: Option<&ClipMask>,
) -> Option<()> {
if stroke.width < 0.0 {
return None;
}
let res_scale = PathStroker::compute_resolution_scale(&transform);
let dash_path;
let path = if let Some(ref dash) = stroke.dash {
dash_path = path.dash(dash, res_scale)?;
&dash_path
} else {
path
};
if let Some(coverage) = treat_as_hairline(paint, stroke, transform) {
let mut paint = paint.clone();
if coverage == 1.0 {
// No changes to the `paint`.
} else if paint.blend_mode.should_pre_scale_coverage() {
                // This is the old technique, which we preserve for now so
                // we don't change previous results (testing).
                // The new way seems fine, it's just (a tiny bit) different.
let scale = (coverage * 256.0) as i32;
let new_alpha = (255 * scale) >> 8;
paint.shader.apply_opacity(new_alpha as f32 / 255.0);
}
if let Some(tiler) = DrawTiler::new(self.width(), self.height()) {
let mut path = path.clone(); // TODO: avoid cloning
let mut paint = paint.clone();
if !transform.is_identity() {
paint.shader.transform(transform);
path = path.transform(transform)?;
}
for tile in tiler {
let ts = Transform::from_translate(-(tile.x() as f32), -(tile.y() as f32));
path = path.transform(ts)?;
paint.shader.transform(ts);
let mut subpix = self.subpixmap(tile.to_int_rect())?;
let submask = clip_mask.and_then(|mask| mask.submask(tile.to_int_rect()));
                    // We're ignoring "errors" here because `stroke_hairline` will return `None`
                    // when rendering a tile that doesn't have a path on it,
                    // which is not an error in this case.
Self::stroke_hairline(&path, &paint, stroke.line_cap, submask, &mut subpix);
let ts = Transform::from_translate(tile.x() as f32, tile.y() as f32);
path = path.transform(ts)?;
paint.shader.transform(ts);
}
Some(())
} else {
let subpix = &mut self.as_subpixmap();
let submask = clip_mask.map(|mask| mask.as_submask());
if !transform.is_identity() {
paint.shader.transform(transform);
let path = path.clone().transform(transform)?; // TODO: avoid clone
Self::stroke_hairline(&path, &paint, stroke.line_cap, submask, subpix)
} else {
Self::stroke_hairline(path, &paint, stroke.line_cap, submask, subpix)
}
}
} else {
let path = path.stroke(stroke, res_scale)?;
self.fill_path(&path, paint, FillRule::Winding, transform, clip_mask)
}
}
    /// Strokes paths with subpixel/hairline width.
fn stroke_hairline(
path: &Path,
paint: &Paint,
line_cap: LineCap,
clip_mask: Option<SubClipMaskRef>,
pixmap: &mut SubPixmapMut,
) -> Option<()> {
let clip = pixmap.size.to_screen_int_rect(0, 0);
let mut blitter = RasterPipelineBlitter::new(paint, clip_mask, pixmap)?;
if paint.anti_alias {
scan::hairline_aa::stroke_path(path, line_cap, &clip, &mut blitter)
} else {
scan::hairline::stroke_path(path, line_cap, &clip, &mut blitter)
}
}
/// Draws a `Pixmap` on top of the current `Pixmap`.
///
    /// This is essentially filling a rectangle with a `pixmap` pattern.
pub fn draw_pixmap(
&mut self,
x: i32,
y: i32,
pixmap: PixmapRef,
paint: &PixmapPaint,
transform: Transform,
clip_mask: Option<&ClipMask>,
) -> Option<()> {
let rect = pixmap.size().to_int_rect(x, y).to_rect();
// TODO: SkSpriteBlitter
// TODO: partially clipped
// TODO: clipped out
// Translate pattern as well as bounds.
let patt_transform = Transform::from_translate(x as f32, y as f32);
let paint = Paint {
shader: Pattern::new(
pixmap,
                SpreadMode::Pad, // Pad, otherwise we will get weird border overlaps.
paint.quality,
paint.opacity,
patt_transform,
),
blend_mode: paint.blend_mode,
            anti_alias: false, // Skia doesn't use it either.
force_hq_pipeline: false, // Pattern will use hq anyway.
};
self.fill_rect(rect, &paint, transform, clip_mask)
}
}
fn treat_as_hairline(paint: &Paint, stroke: &Stroke, mut ts: Transform) -> Option<f32> {
fn fast_len(p: Point) -> f32 {
let mut x = p.x.abs();
let mut y = p.y.abs();
if x < y {
core::mem::swap(&mut x, &mut y);
}
x + y.half()
}
debug_assert!(stroke.width >= 0.0);
if stroke.width == 0.0 {
return Some(1.0);
}
if !paint.anti_alias {
return None;
}
// We don't care about translate.
ts.tx = 0.0;
ts.ty = 0.0;
// We need to try to fake a thick-stroke with a modulated hairline.
let mut points = [
Point::from_xy(stroke.width, 0.0),
Point::from_xy(0.0, stroke.width),
];
ts.map_points(&mut points);
let len0 = fast_len(points[0]);
let len1 = fast_len(points[1]);
if len0 <= 1.0 && len1 <= 1.0 {
return Some(len0.ave(len1));
}
None
}
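The `fast_len` approximation inside `treat_as_hairline` estimates a vector's length as `max + min/2`, trading a sqrt for a bounded error. A standalone check of that trade-off (the ~6.1% bound is the error at the diagonal, `1.5/sqrt(2) - 1`; this is an illustration, not crate code):

```rust
// Standalone copy of the `fast_len` approximation used by `treat_as_hairline`.
fn fast_len(x: f32, y: f32) -> f32 {
    let (mut x, mut y) = (x.abs(), y.abs());
    if x < y {
        std::mem::swap(&mut x, &mut y);
    }
    // Approximate sqrt(x^2 + y^2) as max + min/2.
    x + y / 2.0
}

fn main() {
    // Sweep directions from (1, 0) to (1, 1) and track the worst relative error.
    let mut worst: f32 = 0.0;
    for i in 0..=100 {
        let y = i as f32 / 100.0;
        let exact = (1.0 + y * y).sqrt();
        let approx = fast_len(1.0, y);
        worst = worst.max((approx - exact).abs() / exact);
    }
    assert!(worst < 0.062);             // peak error near the diagonal
    assert_eq!(fast_len(3.0, 0.0), 3.0); // exact on the axes
}
```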
/// Sometimes in the drawing pipeline, we have to perform math on path coordinates, even after
/// the path is in device-coordinates. Tessellation and clipping are two examples. Usually this
/// is pretty modest, but it can involve subtracting/adding coordinates, or multiplying by
/// small constants (e.g. 2, 3, 4). To preflight cases where these operations could turn
/// finite path values into infinities (or NaNs), we allow the upper drawing code to reject
/// the path if its bounds (in device coordinates) is too close to max float.
fn is_too_big_for_math(path: &Path) -> bool {
// This value is just a guess. smaller is safer, but we don't want to reject largish paths
// that we don't have to.
const SCALE_DOWN_TO_ALLOW_FOR_SMALL_MULTIPLIES: f32 = 0.25;
const MAX: f32 = SCALAR_MAX * SCALE_DOWN_TO_ALLOW_FOR_SMALL_MULTIPLIES;
let b = path.bounds();
// use ! expression so we return true if bounds contains NaN
!(b.left() >= -MAX && b.top() >= -MAX && b.right() <= MAX && b.bottom() <= MAX)
}
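The negated comparison in `is_too_big_for_math` is deliberate: any comparison with NaN is false, so writing the check as `!(everything in range)` also rejects bounds containing NaN. A standalone illustration of the idiom:

```rust
// Mirrors the `!(...)` trick: out-of-range values AND NaNs both return true.
fn too_big(left: f32, top: f32, right: f32, bottom: f32, max: f32) -> bool {
    !(left >= -max && top >= -max && right <= max && bottom <= max)
}

fn main() {
    let max = 1.0e9;
    assert!(!too_big(0.0, 0.0, 100.0, 100.0, max));     // ordinary bounds pass
    assert!(too_big(0.0, 0.0, 2.0e9, 100.0, max));      // too large: rejected
    assert!(too_big(0.0, f32::NAN, 100.0, 100.0, max)); // NaN: rejected too
}
```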
/// Splits the target pixmap into a list of tiles.
///
/// Skia/tiny-skia uses a lot of fixed-point math during path rendering.
/// Probably more for precision than performance.
/// Our fixed-point types are limited to 8192 and 32768 respectively,
/// which means that we cannot render a path larger than 8192px onto a pixmap directly.
/// When the pixmap is smaller than 8192px, the path will be clipped automatically anyway,
/// but for larger pixmaps we have to render in tiles.
pub(crate) struct DrawTiler {
image_width: u32,
image_height: u32,
x_offset: u32,
y_offset: u32,
finished: bool,
}
impl DrawTiler {
// 8K is 1 too big, since 8K << supersample == 32768 which is too big for Fixed.
const MAX_DIMENSIONS: u32 = 8192 - 1;
fn required(image_width: u32, image_height: u32) -> bool {
image_width > Self::MAX_DIMENSIONS || image_height > Self::MAX_DIMENSIONS
}
pub(crate) fn new(image_width: u32, image_height: u32) -> Option<Self> {
if Self::required(image_width, image_height) {
Some(DrawTiler {
image_width,
image_height,
x_offset: 0,
y_offset: 0,
finished: false,
})
} else {
None
}
}
}
impl Iterator for DrawTiler {
type Item = ScreenIntRect;
fn next(&mut self) -> Option<Self::Item> {
if self.finished {
return None;
}
        // TODO: iterate only over tiles that are actually affected by the shape
if self.x_offset < self.image_width && self.y_offset < self.image_height {
let h = if self.y_offset < self.image_height {
(self.image_height - self.y_offset).min(Self::MAX_DIMENSIONS)
} else {
self.image_height
};
let r = ScreenIntRect::from_xywh(
self.x_offset,
self.y_offset,
(self.image_width - self.x_offset).min(Self::MAX_DIMENSIONS),
h,
);
self.x_offset += Self::MAX_DIMENSIONS;
if self.x_offset >= self.image_width {
self.x_offset = 0;
self.y_offset += Self::MAX_DIMENSIONS;
}
return r;
}
None
}
}
#[cfg(test)]
mod tests {
use super::*;
const MAX_DIM: u32 = DrawTiler::MAX_DIMENSIONS;
#[test]
fn skip() {
assert!(DrawTiler::new(100, 500).is_none());
}
#[test]
fn horizontal() {
let mut iter = DrawTiler::new(10000, 500).unwrap();
assert_eq!(iter.next(), ScreenIntRect::from_xywh(0, 0, MAX_DIM, 500));
assert_eq!(
iter.next(),
ScreenIntRect::from_xywh(MAX_DIM, 0, 10000 - MAX_DIM, 500)
);
assert_eq!(iter.next(), None);
}
#[test]
fn vertical() {
let mut iter = DrawTiler::new(500, 10000).unwrap();
assert_eq!(iter.next(), ScreenIntRect::from_xywh(0, 0, 500, MAX_DIM));
assert_eq!(
iter.next(),
ScreenIntRect::from_xywh(0, MAX_DIM, 500, 10000 - MAX_DIM)
);
assert_eq!(iter.next(), None);
}
#[test]
fn rect() {
let mut iter = DrawTiler::new(10000, 10000).unwrap();
// Row 1
assert_eq!(
iter.next(),
ScreenIntRect::from_xywh(0, 0, MAX_DIM, MAX_DIM)
);
assert_eq!(
iter.next(),
ScreenIntRect::from_xywh(MAX_DIM, 0, 10000 - MAX_DIM, MAX_DIM)
);
// Row 2
assert_eq!(
iter.next(),
ScreenIntRect::from_xywh(0, MAX_DIM, MAX_DIM, 10000 - MAX_DIM)
);
assert_eq!(
iter.next(),
ScreenIntRect::from_xywh(MAX_DIM, MAX_DIM, 10000 - MAX_DIM, 10000 - MAX_DIM)
);
assert_eq!(iter.next(), None);
}
}

// Copyright 2012 Google Inc.
// Copyright 2020 Yevhenii Reizner
//
// Use of this source code is governed by a BSD-style license that can be
// found in the LICENSE file.
use super::point64::{Point64, SearchAxis};
use super::quad64;
use super::Scalar64;
#[cfg(all(not(feature = "std"), feature = "no-std-float"))]
use tiny_skia_path::NoStdFloat;
pub const POINT_COUNT: usize = 4;
const PI: f64 = 3.141592653589793;
pub struct Cubic64Pair {
pub points: [Point64; 7],
}
pub struct Cubic64 {
pub points: [Point64; POINT_COUNT],
}
impl Cubic64 {
pub fn new(points: [Point64; POINT_COUNT]) -> Self {
Cubic64 { points }
}
pub fn as_f64_slice(&self) -> [f64; POINT_COUNT * 2] {
[
self.points[0].x,
self.points[0].y,
self.points[1].x,
self.points[1].y,
self.points[2].x,
self.points[2].y,
self.points[3].x,
self.points[3].y,
]
}
pub fn point_at_t(&self, t: f64) -> Point64 {
if t == 0.0 {
return self.points[0];
}
if t == 1.0 {
return self.points[3];
}
let one_t = 1.0 - t;
let one_t2 = one_t * one_t;
let a = one_t2 * one_t;
let b = 3.0 * one_t2 * t;
let t2 = t * t;
let c = 3.0 * one_t * t2;
let d = t2 * t;
Point64::from_xy(
a * self.points[0].x
+ b * self.points[1].x
+ c * self.points[2].x
+ d * self.points[3].x,
a * self.points[0].y
+ b * self.points[1].y
+ c * self.points[2].y
+ d * self.points[3].y,
)
}
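`point_at_t` above evaluates the cubic directly in the Bernstein basis. A standalone one-dimensional cross-check against de Casteljau subdivision, which must agree at every `t` (sketch only, plain `f64` instead of `Point64`):

```rust
// Bernstein-basis evaluation, one coordinate, mirroring `point_at_t`.
fn bernstein(p: [f64; 4], t: f64) -> f64 {
    let one_t = 1.0 - t;
    let a = one_t * one_t * one_t;
    let b = 3.0 * one_t * one_t * t;
    let c = 3.0 * one_t * t * t;
    let d = t * t * t;
    a * p[0] + b * p[1] + c * p[2] + d * p[3]
}

// The same point via repeated linear interpolation (de Casteljau).
fn de_casteljau(p: [f64; 4], t: f64) -> f64 {
    let lerp = |a: f64, b: f64| a + (b - a) * t;
    let (q0, q1, q2) = (lerp(p[0], p[1]), lerp(p[1], p[2]), lerp(p[2], p[3]));
    let (r0, r1) = (lerp(q0, q1), lerp(q1, q2));
    lerp(r0, r1)
}

fn main() {
    let p = [0.0, 1.0, 3.0, 2.0];
    for i in 0..=10 {
        let t = i as f64 / 10.0;
        assert!((bernstein(p, t) - de_casteljau(p, t)).abs() < 1e-12);
    }
}
```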
pub fn search_roots(
&self,
mut extrema: usize,
axis_intercept: f64,
x_axis: SearchAxis,
extreme_ts: &mut [f64; 6],
valid_roots: &mut [f64],
) -> usize {
extrema += self.find_inflections(&mut extreme_ts[extrema..]);
extreme_ts[extrema] = 0.0;
extrema += 1;
extreme_ts[extrema] = 1.0;
debug_assert!(extrema < 6);
extreme_ts[0..extrema].sort_by(cmp_f64);
let mut valid_count = 0;
let mut index = 0;
while index < extrema {
let min = extreme_ts[index];
index += 1;
let max = extreme_ts[index];
if min == max {
continue;
}
let new_t = self.binary_search(min, max, axis_intercept, x_axis);
if new_t >= 0.0 {
if valid_count >= 3 {
return 0;
}
valid_roots[valid_count] = new_t;
valid_count += 1;
}
}
valid_count
}
fn find_inflections(&self, t_values: &mut [f64]) -> usize {
let ax = self.points[1].x - self.points[0].x;
let ay = self.points[1].y - self.points[0].y;
let bx = self.points[2].x - 2.0 * self.points[1].x + self.points[0].x;
let by = self.points[2].y - 2.0 * self.points[1].y + self.points[0].y;
let cx = self.points[3].x + 3.0 * (self.points[1].x - self.points[2].x) - self.points[0].x;
let cy = self.points[3].y + 3.0 * (self.points[1].y - self.points[2].y) - self.points[0].y;
quad64::roots_valid_t(
bx * cy - by * cx,
ax * cy - ay * cx,
ax * by - ay * bx,
t_values,
)
}
// give up when changing t no longer moves point
// also, copy point rather than recompute it when it does change
fn binary_search(&self, min: f64, max: f64, axis_intercept: f64, x_axis: SearchAxis) -> f64 {
let mut t = (min + max) / 2.0;
let mut step = (t - min) / 2.0;
let mut cubic_at_t = self.point_at_t(t);
let mut calc_pos = cubic_at_t.axis_coord(x_axis);
let mut calc_dist = calc_pos - axis_intercept;
loop {
let prior_t = min.max(t - step);
let less_pt = self.point_at_t(prior_t);
if less_pt.x.approximately_equal_half(cubic_at_t.x)
&& less_pt.y.approximately_equal_half(cubic_at_t.y)
{
return -1.0; // binary search found no point at this axis intercept
}
let less_dist = less_pt.axis_coord(x_axis) - axis_intercept;
let last_step = step;
step /= 2.0;
let ok = if calc_dist > 0.0 {
calc_dist > less_dist
} else {
calc_dist < less_dist
};
if ok {
t = prior_t;
} else {
let next_t = t + last_step;
if next_t > max {
return -1.0;
}
let more_pt = self.point_at_t(next_t);
if more_pt.x.approximately_equal_half(cubic_at_t.x)
&& more_pt.y.approximately_equal_half(cubic_at_t.y)
{
return -1.0; // binary search found no point at this axis intercept
}
let more_dist = more_pt.axis_coord(x_axis) - axis_intercept;
let ok = if calc_dist > 0.0 {
calc_dist <= more_dist
} else {
calc_dist >= more_dist
};
if ok {
continue;
}
t = next_t;
}
let test_at_t = self.point_at_t(t);
cubic_at_t = test_at_t;
calc_pos = cubic_at_t.axis_coord(x_axis);
calc_dist = calc_pos - axis_intercept;
if calc_pos.approximately_equal(axis_intercept) {
break;
}
}
t
}
pub fn chop_at(&self, t: f64) -> Cubic64Pair {
let mut dst = [Point64::zero(); 7];
if t == 0.5 {
dst[0] = self.points[0];
dst[1].x = (self.points[0].x + self.points[1].x) / 2.0;
dst[1].y = (self.points[0].y + self.points[1].y) / 2.0;
dst[2].x = (self.points[0].x + 2.0 * self.points[1].x + self.points[2].x) / 4.0;
dst[2].y = (self.points[0].y + 2.0 * self.points[1].y + self.points[2].y) / 4.0;
dst[3].x =
(self.points[0].x + 3.0 * (self.points[1].x + self.points[2].x) + self.points[3].x)
/ 8.0;
dst[3].y =
(self.points[0].y + 3.0 * (self.points[1].y + self.points[2].y) + self.points[3].y)
/ 8.0;
dst[4].x = (self.points[1].x + 2.0 * self.points[2].x + self.points[3].x) / 4.0;
dst[4].y = (self.points[1].y + 2.0 * self.points[2].y + self.points[3].y) / 4.0;
dst[5].x = (self.points[2].x + self.points[3].x) / 2.0;
dst[5].y = (self.points[2].y + self.points[3].y) / 2.0;
dst[6] = self.points[3];
Cubic64Pair { points: dst }
} else {
interp_cubic_coords_x(&self.points, t, &mut dst);
interp_cubic_coords_y(&self.points, t, &mut dst);
Cubic64Pair { points: dst }
}
}
}
pub fn coefficients(src: &[f64]) -> (f64, f64, f64, f64) {
let mut a = src[6]; // d
let mut b = src[4] * 3.0; // 3*c
let mut c = src[2] * 3.0; // 3*b
let d = src[0]; // a
a -= d - c + b; // A = -a + 3*b - 3*c + d
b += 3.0 * d - 2.0 * c; // B = 3*a - 6*b + 3*c
c -= 3.0 * d; // C = -3*a + 3*b
(a, b, c, d)
}
// from SkGeometry.cpp (and Numeric Solutions, 5.6)
pub fn roots_valid_t(a: f64, b: f64, c: f64, d: f64, t: &mut [f64; 3]) -> usize {
let mut s = [0.0; 3];
let real_roots = roots_real(a, b, c, d, &mut s);
let mut found_roots = quad64::push_valid_ts(&s, real_roots, t);
'outer: for index in 0..real_roots {
let t_value = s[index];
if !t_value.approximately_one_or_less() && t_value.between(1.0, 1.00005) {
for idx2 in 0..found_roots {
if t[idx2].approximately_equal(1.0) {
continue 'outer;
}
}
debug_assert!(found_roots < 3);
t[found_roots] = 1.0;
found_roots += 1;
} else if !t_value.approximately_zero_or_more() && t_value.between(-0.00005, 0.0) {
for idx2 in 0..found_roots {
if t[idx2].approximately_equal(0.0) {
continue 'outer;
}
}
debug_assert!(found_roots < 3);
t[found_roots] = 0.0;
found_roots += 1;
}
}
found_roots
}
fn roots_real(a: f64, b: f64, c: f64, d: f64, s: &mut [f64; 3]) -> usize {
if a.approximately_zero()
&& a.approximately_zero_when_compared_to(b)
&& a.approximately_zero_when_compared_to(c)
&& a.approximately_zero_when_compared_to(d)
{
// we're just a quadratic
return quad64::roots_real(b, c, d, s);
}
if d.approximately_zero_when_compared_to(a)
&& d.approximately_zero_when_compared_to(b)
&& d.approximately_zero_when_compared_to(c)
{
// 0 is one root
let mut num = quad64::roots_real(a, b, c, s);
for i in 0..num {
if s[i].approximately_zero() {
return num;
}
}
s[num] = 0.0;
num += 1;
return num;
}
if (a + b + c + d).approximately_zero() {
// 1 is one root
let mut num = quad64::roots_real(a, a + b, -d, s);
for i in 0..num {
if s[i].almost_dequal_ulps(1.0) {
return num;
}
}
s[num] = 1.0;
num += 1;
return num;
}
let (a, b, c) = {
let inv_a = 1.0 / a;
let a = b * inv_a;
let b = c * inv_a;
let c = d * inv_a;
(a, b, c)
};
let a2 = a * a;
let q = (a2 - b * 3.0) / 9.0;
let r = (2.0 * a2 * a - 9.0 * a * b + 27.0 * c) / 54.0;
let r2 = r * r;
let q3 = q * q * q;
let r2_minus_q3 = r2 - q3;
let adiv3 = a / 3.0;
let mut offset = 0;
if r2_minus_q3 < 0.0 {
// we have 3 real roots
// the divide/root can, due to finite precisions, be slightly outside of -1...1
let theta = (r / q3.sqrt()).bound(-1.0, 1.0).acos();
let neg2_root_q = -2.0 * q.sqrt();
let mut rr = neg2_root_q * (theta / 3.0).cos() - adiv3;
s[offset] = rr;
offset += 1;
rr = neg2_root_q * ((theta + 2.0 * PI) / 3.0).cos() - adiv3;
if !s[0].almost_dequal_ulps(rr) {
s[offset] = rr;
offset += 1;
}
rr = neg2_root_q * ((theta - 2.0 * PI) / 3.0).cos() - adiv3;
if !s[0].almost_dequal_ulps(rr) && (offset == 1 || !s[1].almost_dequal_ulps(rr)) {
s[offset] = rr;
offset += 1;
}
} else {
// we have 1 real root
let sqrt_r2_minus_q3 = r2_minus_q3.sqrt();
let mut a = r.abs() + sqrt_r2_minus_q3;
a = super::cube_root(a);
if r > 0.0 {
a = -a;
}
if a != 0.0 {
a += q / a;
}
let mut root = a - adiv3;
s[offset] = root;
offset += 1;
// a double root occurs when r^2 == q^3 (`r2` and `q3` from above)
if r2.almost_dequal_ulps(q3) {
root = -a / 2.0 - adiv3;
if !s[0].almost_dequal_ulps(root) {
s[offset] = root;
offset += 1;
}
}
}
offset
}
// Cubic64'(t) = At^2 + Bt + C, where
// A = 3(-a + 3(b - c) + d)
// B = 6(a - 2b + c)
// C = 3(b - a)
// Solve for t, keeping only those that fit between 0 < t < 1
pub fn find_extrema(src: &[f64], t_values: &mut [f64]) -> usize {
// we divide A,B,C by 3 to simplify
let a = src[0];
let b = src[2];
let c = src[4];
let d = src[6];
let a2 = d - a + 3.0 * (b - c);
let b2 = 2.0 * (a - b - b + c);
let c2 = b - a;
quad64::roots_valid_t(a2, b2, c2, t_values)
}
// Skia doesn't seem to care about NaN/inf during sorting, so we don't either.
fn cmp_f64(a: &f64, b: &f64) -> core::cmp::Ordering {
if a < b {
core::cmp::Ordering::Less
} else if a > b {
core::cmp::Ordering::Greater
} else {
core::cmp::Ordering::Equal
}
}
// classic one t subdivision
fn interp_cubic_coords_x(src: &[Point64; 4], t: f64, dst: &mut [Point64; 7]) {
use super::interp;
let ab = interp(src[0].x, src[1].x, t);
let bc = interp(src[1].x, src[2].x, t);
let cd = interp(src[2].x, src[3].x, t);
let abc = interp(ab, bc, t);
let bcd = interp(bc, cd, t);
let abcd = interp(abc, bcd, t);
dst[0].x = src[0].x;
dst[1].x = ab;
dst[2].x = abc;
dst[3].x = abcd;
dst[4].x = bcd;
dst[5].x = cd;
dst[6].x = src[3].x;
}
fn interp_cubic_coords_y(src: &[Point64; 4], t: f64, dst: &mut [Point64; 7]) {
use super::interp;
let ab = interp(src[0].y, src[1].y, t);
let bc = interp(src[1].y, src[2].y, t);
let cd = interp(src[2].y, src[3].y, t);
let abc = interp(ab, bc, t);
let bcd = interp(bc, cd, t);
let abcd = interp(abc, bcd, t);
dst[0].y = src[0].y;
dst[1].y = ab;
dst[2].y = abc;
dst[3].y = abcd;
dst[4].y = bcd;
dst[5].y = cd;
dst[6].y = src[3].y;
}
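The two helpers above are the classic de Casteljau split applied per coordinate. A 1-D sketch (hypothetical names) showing that the shared middle value `abcd` equals direct evaluation at `t`, and that the endpoints survive the split:

```rust
// Linear interpolation, as in `interp` above.
fn interp(a: f64, b: f64, t: f64) -> f64 {
    a + (b - a) * t
}

// de Casteljau: split the 1-D cubic p[0..=3] at t into 7 values,
// where out[0..=3] and out[3..=6] are the two halves.
fn split(p: [f64; 4], t: f64) -> [f64; 7] {
    let ab = interp(p[0], p[1], t);
    let bc = interp(p[1], p[2], t);
    let cd = interp(p[2], p[3], t);
    let abc = interp(ab, bc, t);
    let bcd = interp(bc, cd, t);
    let abcd = interp(abc, bcd, t);
    [p[0], ab, abc, abcd, bcd, cd, p[3]]
}

// Direct Bernstein evaluation for comparison.
fn eval(p: [f64; 4], t: f64) -> f64 {
    let s = 1.0 - t;
    p[0] * s * s * s + 3.0 * p[1] * s * s * t + 3.0 * p[2] * s * t * t + p[3] * t * t * t
}

fn main() {
    let p = [0.0, 10.0, 20.0, 5.0];
    let d = split(p, 0.3);
    // the shared middle point is the curve point at t
    assert!((d[3] - eval(p, 0.3)).abs() < 1e-12);
    // endpoints are preserved
    assert!(d[0] == p[0] && d[6] == p[3]);
}
```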


@@ -0,0 +1,127 @@
// Copyright 2012 Google Inc.
// Copyright 2020 Yevhenii Reizner
//
// Use of this source code is governed by a BSD-style license that can be
// found in the LICENSE file.
/*
Find the intersection of a line and cubic by solving for valid t values.
Analogous to line-quadratic intersection, solve line-cubic intersection by
representing the cubic as:
x = a(1-t)^3 + 3b(1-t)^2t + 3c(1-t)t^2 + dt^3
y = e(1-t)^3 + 3f(1-t)^2t + 3g(1-t)t^2 + ht^3
and the line as:
y = i*x + j (if the line is more horizontal)
or:
x = i*y + j (if the line is more vertical)
Then using Mathematica, solve for the values of t where the cubic intersects the
line:
(in) Resultant[
a*(1 - t)^3 + 3*b*(1 - t)^2*t + 3*c*(1 - t)*t^2 + d*t^3 - x,
e*(1 - t)^3 + 3*f*(1 - t)^2*t + 3*g*(1 - t)*t^2 + h*t^3 - i*x - j, x]
(out) -e + j +
3 e t - 3 f t -
3 e t^2 + 6 f t^2 - 3 g t^2 +
e t^3 - 3 f t^3 + 3 g t^3 - h t^3 +
i ( a -
3 a t + 3 b t +
3 a t^2 - 6 b t^2 + 3 c t^2 -
a t^3 + 3 b t^3 - 3 c t^3 + d t^3 )
if i goes to infinity, we can rewrite the line in terms of x. Mathematica:
(in) Resultant[
a*(1 - t)^3 + 3*b*(1 - t)^2*t + 3*c*(1 - t)*t^2 + d*t^3 - i*y - j,
e*(1 - t)^3 + 3*f*(1 - t)^2*t + 3*g*(1 - t)*t^2 + h*t^3 - y, y]
(out) a - j -
3 a t + 3 b t +
3 a t^2 - 6 b t^2 + 3 c t^2 -
a t^3 + 3 b t^3 - 3 c t^3 + d t^3 -
i ( e -
3 e t + 3 f t +
3 e t^2 - 6 f t^2 + 3 g t^2 -
e t^3 + 3 f t^3 - 3 g t^3 + h t^3 )
Solving this with Mathematica produces an expression with hundreds of terms;
instead, use the Numeric Solutions recipe to solve the cubic.
The near-horizontal case, in terms of: Ax^3 + Bx^2 + Cx + D == 0
A = (-(-e + 3*f - 3*g + h) + i*(-a + 3*b - 3*c + d) )
B = 3*(-( e - 2*f + g ) + i*( a - 2*b + c ) )
C = 3*(-(-e + f ) + i*(-a + b ) )
D = (-( e ) + i*( a ) + j )
The near-vertical case, in terms of: Ax^3 + Bx^2 + Cx + D == 0
A = ( (-a + 3*b - 3*c + d) - i*(-e + 3*f - 3*g + h) )
B = 3*( ( a - 2*b + c ) - i*( e - 2*f + g ) )
C = 3*( (-a + b ) - i*(-e + f ) )
D = ( ( a ) - i*( e ) - j )
For horizontal lines:
(in) Resultant[
a*(1 - t)^3 + 3*b*(1 - t)^2*t + 3*c*(1 - t)*t^2 + d*t^3 - j,
e*(1 - t)^3 + 3*f*(1 - t)^2*t + 3*g*(1 - t)*t^2 + h*t^3 - y, y]
(out) e - j -
3 e t + 3 f t +
3 e t^2 - 6 f t^2 + 3 g t^2 -
e t^3 + 3 f t^3 - 3 g t^3 + h t^3
*/
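The near-horizontal coefficients above can be sanity-checked numerically: the resultant polynomial A·t³ + B·t² + C·t + D should equal i·x(t) − y(t) + j for every t. A small sketch (hypothetical helper names) under that assumption:

```rust
// Bernstein evaluation of one cubic coordinate.
fn bez(p: [f64; 4], t: f64) -> f64 {
    let s = 1.0 - t;
    p[0] * s * s * s + 3.0 * p[1] * s * s * t + 3.0 * p[2] * s * t * t + p[3] * t * t * t
}

// Build A,B,C,D from the near-horizontal case and compare the polynomial
// against i*x(t) - y(t) + j over a sampled t range; returns the worst error.
fn resultant_err(x: [f64; 4], y: [f64; 4], i: f64, j: f64) -> f64 {
    let (a, b, c, d) = (x[0], x[1], x[2], x[3]);
    let (e, f, g, h) = (y[0], y[1], y[2], y[3]);
    let ca = -(-e + 3.0 * f - 3.0 * g + h) + i * (-a + 3.0 * b - 3.0 * c + d);
    let cb = 3.0 * (-(e - 2.0 * f + g) + i * (a - 2.0 * b + c));
    let cc = 3.0 * (-(-e + f) + i * (-a + b));
    let cd = -e + i * a + j;
    let mut err: f64 = 0.0;
    let mut t = 0.0;
    while t <= 1.0 {
        let poly = ((ca * t + cb) * t + cc) * t + cd;
        err = err.max((poly - (i * bez(x, t) - bez(y, t) + j)).abs());
        t += 0.125;
    }
    err
}

fn main() {
    let x = [0.0, 1.0, 3.0, 4.0];
    let y = [2.0, 5.0, -1.0, 3.0];
    assert!(resultant_err(x, y, 0.5, 1.0) < 1e-9);
}
```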
use super::cubic64::{self, Cubic64};
use super::point64::SearchAxis;
use super::Scalar64;
pub fn horizontal_intersect(cubic: &Cubic64, axis_intercept: f64, roots: &mut [f64; 3]) -> usize {
let (a, b, c, mut d) = cubic64::coefficients(&cubic.as_f64_slice()[1..]);
d -= axis_intercept;
let mut count = cubic64::roots_valid_t(a, b, c, d, roots);
let mut index = 0;
while index < count {
let calc_pt = cubic.point_at_t(roots[index]);
if !calc_pt.y.approximately_equal(axis_intercept) {
let mut extreme_ts = [0.0; 6];
let extrema = cubic64::find_extrema(&cubic.as_f64_slice()[1..], &mut extreme_ts);
count = cubic.search_roots(
extrema,
axis_intercept,
SearchAxis::Y,
&mut extreme_ts,
roots,
);
break;
}
index += 1;
}
count
}
pub fn vertical_intersect(cubic: &Cubic64, axis_intercept: f64, roots: &mut [f64; 3]) -> usize {
let (a, b, c, mut d) = cubic64::coefficients(&cubic.as_f64_slice());
d -= axis_intercept;
let mut count = cubic64::roots_valid_t(a, b, c, d, roots);
let mut index = 0;
while index < count {
let calc_pt = cubic.point_at_t(roots[index]);
if !calc_pt.x.approximately_equal(axis_intercept) {
let mut extreme_ts = [0.0; 6];
let extrema = cubic64::find_extrema(&cubic.as_f64_slice(), &mut extreme_ts);
count = cubic.search_roots(
extrema,
axis_intercept,
SearchAxis::X,
&mut extreme_ts,
roots,
);
break;
}
index += 1;
}
count
}


@@ -0,0 +1,151 @@
// Copyright 2012 Google Inc.
// Copyright 2020 Yevhenii Reizner
//
// Use of this source code is governed by a BSD-style license that can be
// found in the LICENSE file.
use tiny_skia_path::{Scalar, SCALAR_MAX};
#[cfg(all(not(feature = "std"), feature = "no-std-float"))]
use tiny_skia_path::NoStdFloat;
// Must be first, because of macro scope rules.
#[macro_use]
pub mod point64;
pub mod cubic64;
pub mod line_cubic_intersections;
mod quad64;
// The code below is from SkPathOpsTypes.
const DBL_EPSILON_ERR: f64 = f64::EPSILON * 4.0;
const FLT_EPSILON_HALF: f64 = (f32::EPSILON / 2.0) as f64;
const FLT_EPSILON_CUBED: f64 = (f32::EPSILON * f32::EPSILON * f32::EPSILON) as f64;
const FLT_EPSILON_INVERSE: f64 = 1.0 / f32::EPSILON as f64;
pub trait Scalar64 {
fn bound(self, min: Self, max: Self) -> Self;
fn between(self, a: f64, b: f64) -> bool;
fn precisely_zero(self) -> bool;
fn approximately_zero_or_more(self) -> bool;
fn approximately_one_or_less(self) -> bool;
fn approximately_zero(self) -> bool;
fn approximately_zero_inverse(self) -> bool;
fn approximately_zero_cubed(self) -> bool;
fn approximately_zero_half(self) -> bool;
fn approximately_zero_when_compared_to(self, other: Self) -> bool;
fn approximately_equal(self, other: Self) -> bool;
fn approximately_equal_half(self, other: Self) -> bool;
fn almost_dequal_ulps(self, other: Self) -> bool;
}
impl Scalar64 for f64 {
// Works just like SkTPin, returning `max` for NaN.
fn bound(self, min: Self, max: Self) -> Self {
max.min(self).max(min)
}
/// Returns true if (a <= self <= b) || (a >= self >= b).
fn between(self, a: f64, b: f64) -> bool {
debug_assert!(
((a <= self && self <= b) || (a >= self && self >= b))
== ((a - self) * (b - self) <= 0.0)
|| (a.precisely_zero() && self.precisely_zero() && b.precisely_zero())
);
(a - self) * (b - self) <= 0.0
}
fn precisely_zero(self) -> bool {
self.abs() < DBL_EPSILON_ERR
}
fn approximately_zero_or_more(self) -> bool {
self > -f64::EPSILON
}
fn approximately_one_or_less(self) -> bool {
self < 1.0 + f64::EPSILON
}
fn approximately_zero(self) -> bool {
self.abs() < f64::EPSILON
}
fn approximately_zero_inverse(self) -> bool {
self.abs() > FLT_EPSILON_INVERSE
}
fn approximately_zero_cubed(self) -> bool {
self.abs() < FLT_EPSILON_CUBED
}
fn approximately_zero_half(self) -> bool {
self.abs() < FLT_EPSILON_HALF
}
fn approximately_zero_when_compared_to(self, other: Self) -> bool {
self == 0.0 || self.abs() < (other * (f32::EPSILON as f64)).abs()
}
// Use this for comparing Ts in the range of 0 to 1. For general numbers (larger and smaller) use
// AlmostEqualUlps instead.
fn approximately_equal(self, other: Self) -> bool {
(self - other).approximately_zero()
}
fn approximately_equal_half(self, other: Self) -> bool {
(self - other).approximately_zero_half()
}
fn almost_dequal_ulps(self, other: Self) -> bool {
if self.abs() < SCALAR_MAX as f64 && other.abs() < SCALAR_MAX as f64 {
(self as f32).almost_dequal_ulps(other as f32)
} else {
(self - other).abs() / self.abs().max(other.abs()) < (f32::EPSILON * 16.0) as f64
}
}
}
pub fn cube_root(x: f64) -> f64 {
if x.approximately_zero_cubed() {
return 0.0;
}
let result = halley_cbrt3d(x.abs());
if x < 0.0 {
-result
} else {
result
}
}
// cube root approximation using 3 iterations of Halley's method (double)
fn halley_cbrt3d(d: f64) -> f64 {
let mut a = cbrt_5d(d);
a = cbrta_halleyd(a, d);
a = cbrta_halleyd(a, d);
cbrta_halleyd(a, d)
}
// cube root approximation using a bit hack, adapted from Kahan's cbrt;
// only the high 32 bits of the f64 matter, so going through
// to_bits/from_bits keeps this endian-independent
fn cbrt_5d(d: f64) -> f64 {
let b1: u64 = 715094163;
let hi = (d.to_bits() >> 32) / 3 + b1;
f64::from_bits(hi << 32)
}
// iterative cube root approximation using Halley's method (double)
fn cbrta_halleyd(a: f64, r: f64) -> f64 {
let a3 = a * a * a;
a * (a3 + r + r) / (a3 + a3 + r)
}
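`cbrta_halleyd` is one Halley step for f(a) = a³ − r; three steps from the bit-hack seed reach full double precision. A sketch (hypothetical names) showing the step converging even from a deliberately crude seed:

```rust
// One Halley iteration toward cbrt(r), matching `cbrta_halleyd` above.
fn halley_step(a: f64, r: f64) -> f64 {
    let a3 = a * a * a;
    a * (a3 + r + r) / (a3 + a3 + r)
}

// Run n Halley steps from a given seed.
fn cbrt_iter(r: f64, mut a: f64, n: u32) -> f64 {
    for _ in 0..n {
        a = halley_step(a, r);
    }
    a
}

fn main() {
    // crude seed 10.0; cbrt(1234.5) is roughly 10.727
    let a = cbrt_iter(1234.5, 10.0, 4);
    assert!((a * a * a - 1234.5).abs() < 1e-9);
}
```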
fn interp(a: f64, b: f64, t: f64) -> f64 {
a + (b - a) * t
}


@@ -0,0 +1,48 @@
// Copyright 2012 Google Inc.
// Copyright 2020 Yevhenii Reizner
//
// Use of this source code is governed by a BSD-style license that can be
// found in the LICENSE file.
use crate::Point;
#[derive(Copy, Clone, PartialEq, Debug)]
pub enum SearchAxis {
X,
Y,
}
#[repr(C)]
#[derive(Copy, Clone, PartialEq, Default, Debug)]
pub struct Point64 {
pub x: f64,
pub y: f64,
}
impl Point64 {
pub fn from_xy(x: f64, y: f64) -> Self {
Point64 { x, y }
}
pub fn from_point(p: Point) -> Self {
Point64 {
x: f64::from(p.x),
y: f64::from(p.y),
}
}
pub fn zero() -> Self {
Point64 { x: 0.0, y: 0.0 }
}
pub fn to_point(&self) -> Point {
Point::from_xy(self.x as f32, self.y as f32)
}
pub fn axis_coord(&self, axis: SearchAxis) -> f64 {
match axis {
SearchAxis::X => self.x,
SearchAxis::Y => self.y,
}
}
}


@@ -0,0 +1,87 @@
// Copyright 2012 Google Inc.
// Copyright 2020 Yevhenii Reizner
//
// Use of this source code is governed by a BSD-style license that can be
// found in the LICENSE file.
use super::Scalar64;
#[cfg(all(not(feature = "std"), feature = "no-std-float"))]
use tiny_skia_path::NoStdFloat;
pub fn push_valid_ts(s: &[f64], real_roots: usize, t: &mut [f64]) -> usize {
let mut found_roots = 0;
'outer: for index in 0..real_roots {
let mut t_value = s[index];
if t_value.approximately_zero_or_more() && t_value.approximately_one_or_less() {
t_value = t_value.bound(0.0, 1.0);
for idx2 in 0..found_roots {
if t[idx2].approximately_equal(t_value) {
continue 'outer;
}
}
t[found_roots] = t_value;
found_roots += 1;
}
}
found_roots
}
// note: caller expects multiple results to be sorted smaller first
// note: http://en.wikipedia.org/wiki/Loss_of_significance has an interesting
// analysis of the quadratic equation, suggesting why the following looks at
// the sign of B -- and further suggesting that the greatest loss of precision
// is in b squared less two a c
pub fn roots_valid_t(a: f64, b: f64, c: f64, t: &mut [f64]) -> usize {
let mut s = [0.0; 3];
let real_roots = roots_real(a, b, c, &mut s);
push_valid_ts(&s, real_roots, t)
}
// Numeric Solutions (5.6) suggests to solve the quadratic by computing
// Q = -1/2(B + sgn(B)Sqrt(B^2 - 4 A C))
// and using the roots
// t1 = Q / A
// t2 = C / Q
//
// this does not discard real roots <= 0 or >= 1
pub fn roots_real(a: f64, b: f64, c: f64, s: &mut [f64; 3]) -> usize {
if a == 0.0 {
return handle_zero(b, c, s);
}
let p = b / (2.0 * a);
let q = c / a;
if a.approximately_zero() && (p.approximately_zero_inverse() || q.approximately_zero_inverse())
{
return handle_zero(b, c, s);
}
// normal form: x^2 + px + q = 0
let p2 = p * p;
if !p2.almost_dequal_ulps(q) && p2 < q {
return 0;
}
let mut sqrt_d = 0.0;
if p2 > q {
sqrt_d = (p2 - q).sqrt();
}
s[0] = sqrt_d - p;
s[1] = -sqrt_d - p;
1 + usize::from(!s[0].almost_dequal_ulps(s[1]))
}
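The `Q = -1/2(B + sgn(B)·sqrt(B² − 4AC))` form used above avoids subtracting two nearly equal numbers. A sketch (hypothetical names) of the same idea on a well-conditioned quadratic:

```rust
// Numerically stable quadratic roots via Q = -1/2 (b + sgn(b) sqrt(b^2 - 4ac)),
// returning (q / a, c / q); assumes two real roots and a != 0.
fn stable_roots(a: f64, b: f64, c: f64) -> (f64, f64) {
    let q = -0.5 * (b + b.signum() * (b * b - 4.0 * a * c).sqrt());
    (q / a, c / q)
}

fn main() {
    // x^2 - 3x + 2 = 0 has roots 2 and 1
    let (r1, r2) = stable_roots(1.0, -3.0, 2.0);
    assert!((r1 - 2.0).abs() < 1e-12);
    assert!((r2 - 1.0).abs() < 1e-12);
    // ill-conditioned case: x^2 - 1e8 x + 1 = 0, small root is ~1e-8;
    // c / q recovers it with full precision
    let (s1, s2) = stable_roots(1.0, -1e8, 1.0);
    assert!((s1.min(s2) - 1e-8).abs() < 1e-16);
}
```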
fn handle_zero(b: f64, c: f64, s: &mut [f64; 3]) -> usize {
if b.approximately_zero() {
s[0] = 0.0;
(c == 0.0) as usize
} else {
s[0] = -c / b;
1
}
}


@@ -0,0 +1,311 @@
// Copyright 2006 The Android Open Source Project
// Copyright 2020 Yevhenii Reizner
//
// Use of this source code is governed by a BSD-style license that can be
// found in the LICENSE file.
use tiny_skia_path::{NormalizedF32, NormalizedF32Exclusive, Point};
#[cfg(all(not(feature = "std"), feature = "no-std-float"))]
use tiny_skia_path::NoStdFloat;
pub use tiny_skia_path::path_geometry::{
chop_cubic_at2, chop_quad_at, find_cubic_max_curvature, find_unit_quad_roots, new_t_values,
CubicCoeff, QuadCoeff,
};
use tiny_skia_path::path_geometry::valid_unit_divide;
// TODO: return custom type
/// Returns 0 for 1 quad, and 1 for two quads, either way the answer is stored in dst[].
///
/// Guarantees that the 1/2 quads will be monotonic.
pub fn chop_quad_at_x_extrema(src: &[Point; 3], dst: &mut [Point; 5]) -> usize {
let a = src[0].x;
let mut b = src[1].x;
let c = src[2].x;
if is_not_monotonic(a, b, c) {
if let Some(t_value) = valid_unit_divide(a - b, a - b - b + c) {
chop_quad_at(src, t_value, dst);
// flatten double quad extrema
dst[1].x = dst[2].x;
dst[3].x = dst[2].x;
return 1;
}
// if we get here, we need to force dst to be monotonic, even though
// we couldn't compute a unit_divide value (probably underflow).
b = if (a - b).abs() < (b - c).abs() { a } else { c };
}
dst[0] = Point::from_xy(a, src[0].y);
dst[1] = Point::from_xy(b, src[1].y);
dst[2] = Point::from_xy(c, src[2].y);
0
}
/// Returns 0 for 1 quad, and 1 for two quads, either way the answer is stored in dst[].
///
/// Guarantees that the 1/2 quads will be monotonic.
pub fn chop_quad_at_y_extrema(src: &[Point; 3], dst: &mut [Point; 5]) -> usize {
let a = src[0].y;
let mut b = src[1].y;
let c = src[2].y;
if is_not_monotonic(a, b, c) {
if let Some(t_value) = valid_unit_divide(a - b, a - b - b + c) {
chop_quad_at(src, t_value, dst);
// flatten double quad extrema
dst[1].y = dst[2].y;
dst[3].y = dst[2].y;
return 1;
}
// if we get here, we need to force dst to be monotonic, even though
// we couldn't compute a unit_divide value (probably underflow).
b = if (a - b).abs() < (b - c).abs() { a } else { c };
}
dst[0] = Point::from_xy(src[0].x, a);
dst[1] = Point::from_xy(src[1].x, b);
dst[2] = Point::from_xy(src[2].x, c);
0
}
fn is_not_monotonic(a: f32, b: f32, c: f32) -> bool {
let ab = a - b;
let mut bc = b - c;
if ab < 0.0 {
bc = -bc;
}
ab == 0.0 || bc < 0.0
}
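The `valid_unit_divide(a - b, a - b - b + c)` calls above solve Q'(t) = 0 for one coordinate of a quadratic Bézier: t = (a − b) / (a − 2b + c). A sketch (hypothetical names) verifying the derivative vanishes there:

```rust
// Extremum parameter of a 1-D quadratic Bezier with control values a, b, c.
fn extremum_t(a: f64, b: f64, c: f64) -> f64 {
    (a - b) / (a - 2.0 * b + c)
}

// Derivative of the quadratic Bezier: Q'(t) = 2((b - a) + (a - 2b + c) t).
fn deriv(a: f64, b: f64, c: f64, t: f64) -> f64 {
    2.0 * ((b - a) + (a - 2.0 * b + c) * t)
}

fn main() {
    let (a, b, c) = (0.0, 8.0, 2.0); // non-monotonic: rises, then falls
    let t = extremum_t(a, b, c);
    assert!(0.0 < t && t < 1.0);
    assert!(deriv(a, b, c, t).abs() < 1e-12);
}
```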
pub fn chop_cubic_at_x_extrema(src: &[Point; 4], dst: &mut [Point; 10]) -> usize {
let mut t_values = new_t_values();
let t_values = find_cubic_extrema(src[0].x, src[1].x, src[2].x, src[3].x, &mut t_values);
chop_cubic_at(src, t_values, dst);
if !t_values.is_empty() {
// we do some cleanup to ensure our X extrema are flat
dst[2].x = dst[3].x;
dst[4].x = dst[3].x;
if t_values.len() == 2 {
dst[5].x = dst[6].x;
dst[7].x = dst[6].x;
}
}
t_values.len()
}
/// Given 4 points on a cubic bezier, chop it into 1, 2, 3 beziers such that
/// the resulting beziers are monotonic in Y.
///
/// This is called by the scan converter.
///
/// Depending on what is returned, dst[] is treated as follows:
///
/// - 0: dst[0..3] is the original cubic
/// - 1: dst[0..3] and dst[3..6] are the two new cubics
/// - 2: dst[0..3], dst[3..6], dst[6..9] are the three new cubics
pub fn chop_cubic_at_y_extrema(src: &[Point; 4], dst: &mut [Point; 10]) -> usize {
let mut t_values = new_t_values();
let t_values = find_cubic_extrema(src[0].y, src[1].y, src[2].y, src[3].y, &mut t_values);
chop_cubic_at(src, t_values, dst);
if !t_values.is_empty() {
// we do some cleanup to ensure our Y extrema are flat
dst[2].y = dst[3].y;
dst[4].y = dst[3].y;
if t_values.len() == 2 {
dst[5].y = dst[6].y;
dst[7].y = dst[6].y;
}
}
t_values.len()
}
// Cubic'(t) = At^2 + Bt + C, where
// A = 3(-a + 3(b - c) + d)
// B = 6(a - 2b + c)
// C = 3(b - a)
// Solve for t, keeping only those that fit between 0 < t < 1
fn find_cubic_extrema(
a: f32,
b: f32,
c: f32,
d: f32,
t_values: &mut [NormalizedF32Exclusive; 3],
) -> &[NormalizedF32Exclusive] {
// we divide A,B,C by 3 to simplify
let na = d - a + 3.0 * (b - c);
let nb = 2.0 * (a - b - b + c);
let nc = b - a;
let roots = find_unit_quad_roots(na, nb, nc, t_values);
&t_values[0..roots]
}
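As the comment notes, A, B, C are divided by 3 here. A quick numeric check (hypothetical helpers) that the reduced quadratic `na·t² + nb·t + nc` is one third of the cubic's derivative:

```rust
// Bernstein evaluation of one cubic coordinate.
fn bez(p: [f32; 4], t: f32) -> f32 {
    let s = 1.0 - t;
    p[0] * s * s * s + 3.0 * p[1] * s * s * t + 3.0 * p[2] * s * t * t + p[3] * t * t * t
}

// Cubic'(t) / 3 with the reduced coefficients used by find_cubic_extrema.
fn third_deriv(p: [f32; 4], t: f32) -> f32 {
    let na = p[3] - p[0] + 3.0 * (p[1] - p[2]);
    let nb = 2.0 * (p[0] - p[1] - p[1] + p[2]);
    let nc = p[1] - p[0];
    (na * t + nb) * t + nc
}

fn main() {
    let p = [10.0_f32, 67.0, 298.0, 401.0];
    let t = 0.4_f32;
    // central difference approximates the true derivative
    let numeric = (bez(p, t + 1e-3) - bez(p, t - 1e-3)) / 2e-3;
    assert!((numeric - 3.0 * third_deriv(p, t)).abs() < 0.5);
}
```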
// http://code.google.com/p/skia/issues/detail?id=32
//
// This test code would fail when we didn't check the return result of
// valid_unit_divide in SkChopCubicAt(... NormalizedF32Exclusives[], int roots). The reason is
// that after the first chop, the parameters to valid_unit_divide are equal
// (thanks to finite float precision and rounding in the subtracts). Thus
// even though the 2nd NormalizedF32Exclusive looks < 1.0, after we renormalize it, we end
// up with 1.0, hence the need to check and just return the last cubic as
// a degenerate clump of 4 points in the same place.
pub fn chop_cubic_at(src: &[Point; 4], t_values: &[NormalizedF32Exclusive], dst: &mut [Point]) {
if t_values.is_empty() {
// nothing to chop
dst[0] = src[0];
dst[1] = src[1];
dst[2] = src[2];
dst[3] = src[3];
} else {
let mut t = t_values[0];
let mut tmp = [Point::zero(); 4];
// Reduce the `src` lifetime, so we can use `src = &tmp` later.
let mut src = src;
let mut dst_offset = 0;
for i in 0..t_values.len() {
chop_cubic_at2(src, t, &mut dst[dst_offset..]);
if i == t_values.len() - 1 {
break;
}
dst_offset += 3;
// have src point to the remaining cubic (after the chop)
tmp[0] = dst[dst_offset + 0];
tmp[1] = dst[dst_offset + 1];
tmp[2] = dst[dst_offset + 2];
tmp[3] = dst[dst_offset + 3];
src = &tmp;
// watch out in case the renormalized t isn't in range
let n = valid_unit_divide(
t_values[i + 1].get() - t_values[i].get(),
1.0 - t_values[i].get(),
);
match n {
Some(n) => t = n,
None => {
// if we can't, just create a degenerate cubic
dst[dst_offset + 4] = src[3];
dst[dst_offset + 5] = src[3];
dst[dst_offset + 6] = src[3];
break;
}
}
}
}
}
pub fn chop_cubic_at_max_curvature(
src: &[Point; 4],
t_values: &mut [NormalizedF32Exclusive; 3],
dst: &mut [Point],
) -> usize {
let mut roots = [NormalizedF32::ZERO; 3];
let roots = find_cubic_max_curvature(src, &mut roots);
// Throw out values not inside 0..1.
let mut count = 0;
for root in roots {
if 0.0 < root.get() && root.get() < 1.0 {
t_values[count] = NormalizedF32Exclusive::new_bounded(root.get());
count += 1;
}
}
if count == 0 {
dst[0..4].copy_from_slice(src);
} else {
chop_cubic_at(src, &t_values[0..count], dst);
}
count + 1
}
pub fn chop_mono_cubic_at_x(src: &[Point; 4], x: f32, dst: &mut [Point; 7]) -> bool {
cubic_dchop_at_intercept(src, x, true, dst)
}
pub fn chop_mono_cubic_at_y(src: &[Point; 4], y: f32, dst: &mut [Point; 7]) -> bool {
cubic_dchop_at_intercept(src, y, false, dst)
}
fn cubic_dchop_at_intercept(
src: &[Point; 4],
intercept: f32,
is_vertical: bool,
dst: &mut [Point; 7],
) -> bool {
use crate::path64::{cubic64::Cubic64, line_cubic_intersections, point64::Point64};
let src = [
Point64::from_point(src[0]),
Point64::from_point(src[1]),
Point64::from_point(src[2]),
Point64::from_point(src[3]),
];
let cubic = Cubic64::new(src);
let mut roots = [0.0; 3];
let count = if is_vertical {
line_cubic_intersections::vertical_intersect(&cubic, f64::from(intercept), &mut roots)
} else {
line_cubic_intersections::horizontal_intersect(&cubic, f64::from(intercept), &mut roots)
};
if count > 0 {
let pair = cubic.chop_at(roots[0]);
for i in 0..7 {
dst[i] = pair.points[i].to_point();
}
true
} else {
false
}
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn chop_cubic_at_y_extrema_1() {
let src = [
Point::from_xy(10.0, 20.0),
Point::from_xy(67.0, 437.0),
Point::from_xy(298.0, 213.0),
Point::from_xy(401.0, 214.0),
];
let mut dst = [Point::zero(); 10];
let n = chop_cubic_at_y_extrema(&src, &mut dst);
assert_eq!(n, 2);
assert_eq!(dst[0], Point::from_xy(10.0, 20.0));
assert_eq!(dst[1], Point::from_xy(37.508274, 221.24475));
assert_eq!(dst[2], Point::from_xy(105.541855, 273.19803));
assert_eq!(dst[3], Point::from_xy(180.15599, 273.19803));
assert_eq!(dst[4], Point::from_xy(259.80502, 273.19803));
assert_eq!(dst[5], Point::from_xy(346.9527, 213.99666));
assert_eq!(dst[6], Point::from_xy(400.30844, 213.99666));
assert_eq!(dst[7], Point::from_xy(400.53958, 213.99666));
assert_eq!(dst[8], Point::from_xy(400.7701, 213.99777));
assert_eq!(dst[9], Point::from_xy(401.0, 214.0));
}
}


@@ -0,0 +1,309 @@
// Copyright 2016 Google Inc.
// Copyright 2020 Yevhenii Reizner
//
// Use of this source code is governed by a BSD-style license that can be
// found in the LICENSE file.
use tiny_skia_path::ScreenIntRect;
use crate::{BlendMode, LengthU32, Paint, PixmapRef, PremultipliedColorU8, Shader};
use crate::{ALPHA_U8_OPAQUE, ALPHA_U8_TRANSPARENT};
use crate::alpha_runs::AlphaRun;
use crate::blitter::{Blitter, Mask};
use crate::clip::SubClipMaskRef;
use crate::color::AlphaU8;
use crate::math::LENGTH_U32_ONE;
use crate::pipeline::{self, RasterPipeline, RasterPipelineBuilder};
use crate::pixmap::SubPixmapMut;
pub struct RasterPipelineBlitter<'a, 'b: 'a> {
clip_mask: Option<SubClipMaskRef<'a>>,
pixmap_src: PixmapRef<'a>,
pixmap: &'a mut SubPixmapMut<'b>,
memset2d_color: Option<PremultipliedColorU8>,
blit_anti_h_rp: RasterPipeline,
blit_rect_rp: RasterPipeline,
blit_mask_rp: RasterPipeline,
}
impl<'a, 'b: 'a> RasterPipelineBlitter<'a, 'b> {
pub fn new(
paint: &Paint<'a>,
clip_mask: Option<SubClipMaskRef<'a>>,
pixmap: &'a mut SubPixmapMut<'b>,
) -> Option<Self> {
// Make sure that `clip_mask` has the same size as `pixmap`.
if let Some(mask) = clip_mask {
if mask.size.width() != pixmap.size.width()
|| mask.size.height() != pixmap.size.height()
{
return None;
}
}
// Fast-reject.
// This is basically SkInterpretXfermode().
match paint.blend_mode {
// `Destination` keeps the pixmap unchanged. Nothing to do here.
BlendMode::Destination => return None,
BlendMode::DestinationIn if paint.shader.is_opaque() && paint.is_solid_color() => {
return None
}
_ => {}
}
// We can strength-reduce SourceOver into Source when opaque.
let mut blend_mode = paint.blend_mode;
if paint.shader.is_opaque() && blend_mode == BlendMode::SourceOver && clip_mask.is_none() {
blend_mode = BlendMode::Source;
}
// When we're drawing a constant color in Source mode, we can sometimes just memset.
let mut memset2d_color = None;
if paint.is_solid_color() && blend_mode == BlendMode::Source && clip_mask.is_none() {
// Unlike Skia, our shader cannot be constant.
// Therefore there is no need to run a raster pipeline to get the shader's color.
if let Shader::SolidColor(ref color) = paint.shader {
memset2d_color = Some(color.premultiply().to_color_u8());
}
};
// Clear is just a transparent color memset.
if blend_mode == BlendMode::Clear && !paint.anti_alias && clip_mask.is_none() {
blend_mode = BlendMode::Source;
memset2d_color = Some(PremultipliedColorU8::TRANSPARENT);
}
let blit_anti_h_rp = {
let mut p = RasterPipelineBuilder::new();
p.set_force_hq_pipeline(paint.force_hq_pipeline);
paint.shader.push_stages(&mut p);
if clip_mask.is_some() {
p.push(pipeline::Stage::MaskU8);
}
if blend_mode.should_pre_scale_coverage() {
p.push(pipeline::Stage::Scale1Float);
p.push(pipeline::Stage::LoadDestination);
if let Some(blend_stage) = blend_mode.to_stage() {
p.push(blend_stage);
}
} else {
p.push(pipeline::Stage::LoadDestination);
if let Some(blend_stage) = blend_mode.to_stage() {
p.push(blend_stage);
}
p.push(pipeline::Stage::Lerp1Float);
}
p.push(pipeline::Stage::Store);
p.compile()
};
let blit_rect_rp = {
let mut p = RasterPipelineBuilder::new();
p.set_force_hq_pipeline(paint.force_hq_pipeline);
paint.shader.push_stages(&mut p);
if clip_mask.is_some() {
p.push(pipeline::Stage::MaskU8);
}
if blend_mode == BlendMode::SourceOver && clip_mask.is_none() {
// TODO: ignore when dither_rate is non-zero
p.push(pipeline::Stage::SourceOverRgba);
} else {
if blend_mode != BlendMode::Source {
p.push(pipeline::Stage::LoadDestination);
if let Some(blend_stage) = blend_mode.to_stage() {
p.push(blend_stage);
}
}
p.push(pipeline::Stage::Store);
}
p.compile()
};
let blit_mask_rp = {
let mut p = RasterPipelineBuilder::new();
p.set_force_hq_pipeline(paint.force_hq_pipeline);
paint.shader.push_stages(&mut p);
if clip_mask.is_some() {
p.push(pipeline::Stage::MaskU8);
}
if blend_mode.should_pre_scale_coverage() {
p.push(pipeline::Stage::ScaleU8);
p.push(pipeline::Stage::LoadDestination);
if let Some(blend_stage) = blend_mode.to_stage() {
p.push(blend_stage);
}
} else {
p.push(pipeline::Stage::LoadDestination);
if let Some(blend_stage) = blend_mode.to_stage() {
p.push(blend_stage);
}
p.push(pipeline::Stage::LerpU8);
}
p.push(pipeline::Stage::Store);
p.compile()
};
let pixmap_src = match paint.shader {
Shader::Pattern(ref patt) => patt.pixmap,
// Just a dummy one.
_ => PixmapRef::from_bytes(&[0, 0, 0, 0], 1, 1).unwrap(),
};
Some(RasterPipelineBlitter {
clip_mask,
pixmap_src,
pixmap,
memset2d_color,
blit_anti_h_rp,
blit_rect_rp,
blit_mask_rp,
})
}
}
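The memset fast path stores the paint color premultiplied and packed to 8 bits. The conversion it relies on can be sketched as follows (hypothetical helper; the real conversion lives in `premultiply().to_color_u8()` and goes through float):

```rust
// Premultiply an 8-bit straight-alpha RGBA color: c' = round(c * a / 255).
fn premultiply_u8(r: u8, g: u8, b: u8, a: u8) -> [u8; 4] {
    let mul = |c: u8| -> u8 { ((c as u32 * a as u32 + 127) / 255) as u8 };
    [mul(r), mul(g), mul(b), a]
}

fn main() {
    // 50%-alpha white premultiplies to mid gray
    assert_eq!(premultiply_u8(255, 255, 255, 128), [128, 128, 128, 128]);
    // opaque colors are unchanged
    assert_eq!(premultiply_u8(10, 20, 30, 255), [10, 20, 30, 255]);
}
```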
impl Blitter for RasterPipelineBlitter<'_, '_> {
fn blit_h(&mut self, x: u32, y: u32, width: LengthU32) {
let r = ScreenIntRect::from_xywh_safe(x, y, width, LENGTH_U32_ONE);
self.blit_rect(&r);
}
fn blit_anti_h(&mut self, mut x: u32, y: u32, aa: &mut [AlphaU8], runs: &mut [AlphaRun]) {
let clip_mask_ctx = self
.clip_mask
.map(|c| c.clip_mask_ctx())
.unwrap_or_default();
let mut aa_offset = 0;
let mut run_offset = 0;
let mut run_opt = runs[0];
while let Some(run) = run_opt {
let width = LengthU32::from(run);
match aa[aa_offset] {
ALPHA_U8_TRANSPARENT => {}
ALPHA_U8_OPAQUE => {
self.blit_h(x, y, width);
}
alpha => {
self.blit_anti_h_rp.ctx.current_coverage = alpha as f32 * (1.0 / 255.0);
let rect = ScreenIntRect::from_xywh_safe(x, y, width, LENGTH_U32_ONE);
self.blit_anti_h_rp.run(
&rect,
pipeline::AAMaskCtx::default(),
clip_mask_ctx,
self.pixmap_src,
self.pixmap,
);
}
}
x += width.get();
run_offset += usize::from(run.get());
aa_offset += usize::from(run.get());
run_opt = runs[run_offset];
}
}
fn blit_v(&mut self, x: u32, y: u32, height: LengthU32, alpha: AlphaU8) {
let bounds = ScreenIntRect::from_xywh_safe(x, y, LENGTH_U32_ONE, height);
let mask = Mask {
image: [alpha, alpha],
bounds,
row_bytes: 0, // so we reuse the 1 "row" for all of height
};
self.blit_mask(&mask, &bounds);
}
fn blit_anti_h2(&mut self, x: u32, y: u32, alpha0: AlphaU8, alpha1: AlphaU8) {
let bounds = ScreenIntRect::from_xywh(x, y, 2, 1).unwrap();
let mask = Mask {
image: [alpha0, alpha1],
bounds,
row_bytes: 2,
};
self.blit_mask(&mask, &bounds);
}
fn blit_anti_v2(&mut self, x: u32, y: u32, alpha0: AlphaU8, alpha1: AlphaU8) {
let bounds = ScreenIntRect::from_xywh(x, y, 1, 2).unwrap();
let mask = Mask {
image: [alpha0, alpha1],
bounds,
row_bytes: 1,
};
self.blit_mask(&mask, &bounds);
}
fn blit_rect(&mut self, rect: &ScreenIntRect) {
if let Some(c) = self.memset2d_color {
for y in 0..rect.height() {
let start = self
.pixmap
.offset(rect.x() as usize, (rect.y() + y) as usize);
let end = start + rect.width() as usize;
self.pixmap.pixels_mut()[start..end]
.iter_mut()
.for_each(|p| *p = c);
}
return;
}
let clip_mask_ctx = self
.clip_mask
.map(|c| c.clip_mask_ctx())
.unwrap_or_default();
self.blit_rect_rp.run(
rect,
pipeline::AAMaskCtx::default(),
clip_mask_ctx,
self.pixmap_src,
self.pixmap,
);
}
fn blit_mask(&mut self, mask: &Mask, clip: &ScreenIntRect) {
let aa_mask_ctx = pipeline::AAMaskCtx {
pixels: mask.image,
stride: mask.row_bytes,
shift: (mask.bounds.left() + mask.bounds.top() * mask.row_bytes) as usize,
};
let clip_mask_ctx = self
.clip_mask
.map(|c| c.clip_mask_ctx())
.unwrap_or_default();
self.blit_mask_rp.run(
clip,
aa_mask_ctx,
clip_mask_ctx,
self.pixmap_src,
self.pixmap,
);
}
}

File diff suppressed because it is too large


@@ -0,0 +1,796 @@
// Copyright 2018 Google Inc.
// Copyright 2020 Yevhenii Reizner
//
// Use of this source code is governed by a BSD-style license that can be
// found in the LICENSE file.
/*!
A low precision raster pipeline implementation.
A lowp pipeline uses u16 instead of f32 for math.
Because of that, it doesn't implement stages that require high precision.
The pipeline compiler will automatically decide which one to use.
Skia uses u16x8 (128bit) types for a generic CPU and u16x16 (256bit) for modern x86 CPUs.
But instead of explicit SIMD instructions, it mainly relies on clang's vector extensions.
And since they are unavailable in Rust, we have to do everything manually.
According to our benchmarks, a SIMD-accelerated u16x8 in Rust is almost 2x slower than in Skia.
Not sure why. For example, there is no div instruction for u16x8, so we have to use
a basic scalar version, which means unnecessary loads/stores. No idea what clang does in this case.
Surprisingly, a SIMD-accelerated u16x8 is even slower than a scalar one. Again, not sure why.
Therefore we are using scalar u16x16 by default and relying on rustc/llvm auto vectorization instead.
When targeting a generic CPU, we're just 5-10% slower than Skia, while u16x8 is 30-40% slower.
And while `-C target-cpu=haswell` boosts our performance by around 25%,
we are still 40-60% behind Skia built for Haswell.
On ARM AArch64 the story is different and explicit SIMD makes our code up to 2-3x faster.
*/
use tiny_skia_path::ScreenIntRect;
use crate::PremultipliedColorU8;
use crate::pixmap::SubPixmapMut;
use crate::wide::{f32x8, u16x16, f32x16};
pub const STAGE_WIDTH: usize = 16;
pub type StageFn = fn(p: &mut Pipeline);
pub struct Pipeline<'a, 'b: 'a> {
index: usize,
functions: &'a [StageFn],
pixmap: &'a mut SubPixmapMut<'b>,
clip_mask_ctx: super::ClipMaskCtx<'a>,
mask_ctx: super::AAMaskCtx,
ctx: &'a mut super::Context,
r: u16x16,
g: u16x16,
b: u16x16,
a: u16x16,
dr: u16x16,
dg: u16x16,
db: u16x16,
da: u16x16,
tail: usize,
dx: usize,
dy: usize,
}
impl Pipeline<'_, '_> {
#[inline(always)]
fn next_stage(&mut self) {
let next: fn(&mut Self) = self.functions[self.index];
self.index += 1;
next(self);
}
}
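The `next_stage` trampoline above is the entire dispatch mechanism: each stage bumps an index into a function-pointer list and calls the next entry, and a no-op final entry stops the chain. A minimal standalone sketch of this style (hypothetical stage names, not part of this crate):

```rust
// A toy pipeline using the same index-and-call dispatch as above.
struct P {
    index: usize,
    functions: &'static [fn(&mut P)],
    acc: u32,
}

fn next_stage(p: &mut P) {
    let next = p.functions[p.index];
    p.index += 1;
    next(p);
}

// Each stage does its work, then hands off to the next one.
fn add_one(p: &mut P) { p.acc += 1; next_stage(p); }
fn double(p: &mut P) { p.acc *= 2; next_stage(p); }
// The terminal stage simply returns, unwinding the whole chain.
fn just_return(_: &mut P) {}

fn main() {
    let mut p = P { index: 0, functions: &[add_one, double, just_return], acc: 1 };
    next_stage(&mut p);
    assert_eq!(p.acc, 4); // (1 + 1) * 2
}
```

The attraction of this design is that stage order is data, not control flow, so the same stage functions compose into arbitrary pipelines.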
// Must be in the same order as raster_pipeline::Stage
pub const STAGES: &[StageFn; super::STAGES_COUNT] = &[
move_source_to_destination,
move_destination_to_source,
null_fn, // Clamp0
null_fn, // ClampA
premultiply,
uniform_color,
seed_shader,
load_dst,
store,
null_fn, // Gather
mask_u8,
scale_u8,
lerp_u8,
scale_1_float,
lerp_1_float,
destination_atop,
destination_in,
destination_out,
destination_over,
source_atop,
source_in,
source_out,
source_over,
clear,
modulate,
multiply,
plus,
screen,
xor,
null_fn, // ColorBurn
null_fn, // ColorDodge
darken,
difference,
exclusion,
hard_light,
lighten,
overlay,
null_fn, // SoftLight
null_fn, // Hue
null_fn, // Saturation
null_fn, // Color
null_fn, // Luminosity
source_over_rgba,
transform,
null_fn, // Reflect
null_fn, // Repeat
null_fn, // Bilinear
null_fn, // Bicubic
pad_x1,
reflect_x1,
repeat_x1,
gradient,
evenly_spaced_2_stop_gradient,
xy_to_radius,
null_fn, // XYTo2PtConicalFocalOnCircle
null_fn, // XYTo2PtConicalWellBehaved
null_fn, // XYTo2PtConicalGreater
null_fn, // Mask2PtConicalDegenerates
null_fn, // ApplyVectorMask
];
pub fn fn_ptr(f: StageFn) -> *const () {
f as *const ()
}
pub fn fn_ptr_eq(f1: StageFn, f2: StageFn) -> bool {
core::ptr::eq(f1 as *const (), f2 as *const ())
}
#[inline(never)]
pub fn start(
functions: &[StageFn],
functions_tail: &[StageFn],
rect: &ScreenIntRect,
mask_ctx: super::AAMaskCtx,
clip_mask_ctx: super::ClipMaskCtx,
ctx: &mut super::Context,
pixmap: &mut SubPixmapMut,
) {
let mut p = Pipeline {
index: 0,
functions: &[],
pixmap,
clip_mask_ctx,
mask_ctx,
ctx,
r: u16x16::default(),
g: u16x16::default(),
b: u16x16::default(),
a: u16x16::default(),
dr: u16x16::default(),
dg: u16x16::default(),
db: u16x16::default(),
da: u16x16::default(),
tail: 0,
dx: 0,
dy: 0,
};
for y in rect.y()..rect.bottom() {
let mut x = rect.x() as usize;
let end = rect.right() as usize;
p.functions = functions;
while x + STAGE_WIDTH <= end {
p.index = 0;
p.dx = x;
p.dy = y as usize;
p.tail = STAGE_WIDTH;
p.next_stage();
x += STAGE_WIDTH;
}
if x != end {
p.index = 0;
p.functions = functions_tail;
p.dx = x;
p.dy = y as usize;
p.tail = end - x;
p.next_stage();
}
}
}
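The driver above splits each scanline into full `STAGE_WIDTH` runs plus at most one tail run. The chunk arithmetic can be checked in isolation (a standalone sketch with a hypothetical helper, not the crate's API):

```rust
const STAGE_WIDTH: usize = 16;

// Mirror of the chunking loop in `start`: how many full 16-pixel
// runs a scanline of `width` pixels gets, plus the tail length.
fn chunks(width: usize) -> (usize, usize) {
    let mut x = 0;
    let mut full = 0;
    while x + STAGE_WIDTH <= width {
        full += 1;
        x += STAGE_WIDTH;
    }
    (full, width - x)
}

fn main() {
    assert_eq!(chunks(37), (2, 5)); // 2 * 16 + 5 == 37
    assert_eq!(chunks(16), (1, 0)); // exact multiple: no tail run
    assert_eq!(chunks(7), (0, 7));  // narrower than one chunk
}
```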
fn move_source_to_destination(p: &mut Pipeline) {
p.dr = p.r;
p.dg = p.g;
p.db = p.b;
p.da = p.a;
p.next_stage();
}
fn move_destination_to_source(p: &mut Pipeline) {
p.r = p.dr;
p.g = p.dg;
p.b = p.db;
p.a = p.da;
p.next_stage();
}
fn premultiply(p: &mut Pipeline) {
p.r = div255(p.r * p.a);
p.g = div255(p.g * p.a);
p.b = div255(p.b * p.a);
p.next_stage();
}
fn uniform_color(p: &mut Pipeline) {
let ctx = p.ctx.uniform_color;
p.r = u16x16::splat(ctx.rgba[0]);
p.g = u16x16::splat(ctx.rgba[1]);
p.b = u16x16::splat(ctx.rgba[2]);
p.a = u16x16::splat(ctx.rgba[3]);
p.next_stage();
}
fn seed_shader(p: &mut Pipeline) {
let iota = f32x16(
f32x8::from([0.5, 1.5, 2.5, 3.5, 4.5, 5.5, 6.5, 7.5]),
f32x8::from([8.5, 9.5, 10.5, 11.5, 12.5, 13.5, 14.5, 15.5]),
);
let x = f32x16::splat(p.dx as f32) + iota;
let y = f32x16::splat(p.dy as f32 + 0.5);
split(&x, &mut p.r, &mut p.g);
split(&y, &mut p.b, &mut p.a);
p.next_stage();
}
pub fn load_dst(p: &mut Pipeline) {
load_8888(p.pixmap.slice16_at_xy(p.dx, p.dy), &mut p.dr, &mut p.dg, &mut p.db, &mut p.da);
p.next_stage();
}
pub fn load_dst_tail(p: &mut Pipeline) {
load_8888_tail(p.tail, p.pixmap.slice_at_xy(p.dx, p.dy), &mut p.dr, &mut p.dg, &mut p.db, &mut p.da);
p.next_stage();
}
pub fn store(p: &mut Pipeline) {
store_8888(&p.r, &p.g, &p.b, &p.a, p.pixmap.slice16_at_xy(p.dx, p.dy));
p.next_stage();
}
pub fn store_tail(p: &mut Pipeline) {
store_8888_tail(&p.r, &p.g, &p.b, &p.a, p.tail, p.pixmap.slice_at_xy(p.dx, p.dy));
p.next_stage();
}
fn mask_u8(p: &mut Pipeline) {
let offset = p.clip_mask_ctx.offset(p.dx, p.dy);
let mut c = u16x16::default();
for i in 0..p.tail {
c.0[i] = u16::from(p.clip_mask_ctx.data[offset + i]);
}
if c == u16x16::default() {
return;
}
p.r = div255(p.r * c);
p.g = div255(p.g * c);
p.b = div255(p.b * c);
p.a = div255(p.a * c);
p.next_stage();
}
fn scale_u8(p: &mut Pipeline) {
// Load u8xTail and cast it to u16x16.
let data = p.mask_ctx.copy_at_xy(p.dx, p.dy, p.tail);
let c = u16x16([
u16::from(data[0]),
u16::from(data[1]),
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
]);
p.r = div255(p.r * c);
p.g = div255(p.g * c);
p.b = div255(p.b * c);
p.a = div255(p.a * c);
p.next_stage();
}
fn lerp_u8(p: &mut Pipeline) {
// Load u8xTail and cast it to u16x16.
let data = p.mask_ctx.copy_at_xy(p.dx, p.dy, p.tail);
let c = u16x16([
u16::from(data[0]),
u16::from(data[1]),
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
]);
p.r = lerp(p.dr, p.r, c);
p.g = lerp(p.dg, p.g, c);
p.b = lerp(p.db, p.b, c);
p.a = lerp(p.da, p.a, c);
p.next_stage();
}
fn scale_1_float(p: &mut Pipeline) {
let c = from_float(p.ctx.current_coverage);
p.r = div255(p.r * c);
p.g = div255(p.g * c);
p.b = div255(p.b * c);
p.a = div255(p.a * c);
p.next_stage();
}
fn lerp_1_float(p: &mut Pipeline) {
let c = from_float(p.ctx.current_coverage);
p.r = lerp(p.dr, p.r, c);
p.g = lerp(p.dg, p.g, c);
p.b = lerp(p.db, p.b, c);
p.a = lerp(p.da, p.a, c);
p.next_stage();
}
macro_rules! blend_fn {
($name:ident, $f:expr) => {
fn $name(p: &mut Pipeline) {
p.r = $f(p.r, p.dr, p.a, p.da);
p.g = $f(p.g, p.dg, p.a, p.da);
p.b = $f(p.b, p.db, p.a, p.da);
p.a = $f(p.a, p.da, p.a, p.da);
p.next_stage();
}
};
}
blend_fn!(clear, |_, _, _, _| u16x16::splat(0));
blend_fn!(source_atop, |s, d, sa, da| div255(s * da + d * inv(sa)));
blend_fn!(destination_atop, |s, d, sa, da| div255(d * sa + s * inv(da)));
blend_fn!(source_in, |s, _, _, da| div255(s * da));
blend_fn!(destination_in, |_, d, sa, _| div255(d * sa));
blend_fn!(source_out, |s, _, _, da| div255(s * inv(da)));
blend_fn!(destination_out, |_, d, sa, _| div255(d * inv(sa)));
blend_fn!(source_over, |s, d, sa, _| s + div255(d * inv(sa)));
blend_fn!(destination_over, |s, d, _, da| d + div255(s * inv(da)));
blend_fn!(modulate, |s, d, _, _| div255(s * d));
blend_fn!(multiply, |s, d, sa, da| div255(s * inv(da) + d * inv(sa) + s * d));
blend_fn!(screen, |s, d, _, _| s + d - div255(s * d));
blend_fn!(xor, |s, d, sa, da| div255(s * inv(da) + d * inv(sa)));
// Wants a type for some reason.
blend_fn!(plus, |s: u16x16, d, _, _| (s + d).min(&u16x16::splat(255)));
macro_rules! blend_fn2 {
($name:ident, $f:expr) => {
fn $name(p: &mut Pipeline) {
// The same logic applied to color, and source_over for alpha.
p.r = $f(p.r, p.dr, p.a, p.da);
p.g = $f(p.g, p.dg, p.a, p.da);
p.b = $f(p.b, p.db, p.a, p.da);
p.a = p.a + div255(p.da * inv(p.a));
p.next_stage();
}
};
}
blend_fn2!(darken, |s: u16x16, d, sa, da| s + d - div255((s * da).max(&(d * sa))));
blend_fn2!(lighten, |s: u16x16, d, sa, da| s + d - div255((s * da).min(&(d * sa))));
blend_fn2!(exclusion, |s: u16x16, d, _, _| s + d - u16x16::splat(2) * div255(s * d));
blend_fn2!(difference, |s: u16x16, d, sa, da|
s + d - u16x16::splat(2) * div255((s * da).min(&(d * sa))));
blend_fn2!(hard_light, |s: u16x16, d: u16x16, sa, da| {
div255(s * inv(da) + d * inv(sa)
+ (s+s).cmp_le(&sa).blend(
u16x16::splat(2) * s * d,
sa * da - u16x16::splat(2) * (sa-s)*(da-d)
)
)
});
blend_fn2!(overlay, |s: u16x16, d: u16x16, sa, da| {
div255(s * inv(da) + d * inv(sa)
+ (d+d).cmp_le(&da).blend(
u16x16::splat(2) * s * d,
sa * da - u16x16::splat(2) * (sa-s)*(da-d)
)
)
});
pub fn source_over_rgba(p: &mut Pipeline) {
let pixels = p.pixmap.slice16_at_xy(p.dx, p.dy);
load_8888(pixels, &mut p.dr, &mut p.dg, &mut p.db, &mut p.da);
p.r = p.r + div255(p.dr * inv(p.a));
p.g = p.g + div255(p.dg * inv(p.a));
p.b = p.b + div255(p.db * inv(p.a));
p.a = p.a + div255(p.da * inv(p.a));
store_8888(&p.r, &p.g, &p.b, &p.a, pixels);
p.next_stage();
}
pub fn source_over_rgba_tail(p: &mut Pipeline) {
let pixels = p.pixmap.slice_at_xy(p.dx, p.dy);
load_8888_tail(p.tail, pixels, &mut p.dr, &mut p.dg, &mut p.db, &mut p.da);
p.r = p.r + div255(p.dr * inv(p.a));
p.g = p.g + div255(p.dg * inv(p.a));
p.b = p.b + div255(p.db * inv(p.a));
p.a = p.a + div255(p.da * inv(p.a));
store_8888_tail(&p.r, &p.g, &p.b, &p.a, p.tail, pixels);
p.next_stage();
}
fn transform(p: &mut Pipeline) {
let ts = &p.ctx.transform;
let x = join(&p.r, &p.g);
let y = join(&p.b, &p.a);
let nx = mad(x, f32x16::splat(ts.sx), mad(y, f32x16::splat(ts.kx), f32x16::splat(ts.tx)));
let ny = mad(x, f32x16::splat(ts.ky), mad(y, f32x16::splat(ts.sy), f32x16::splat(ts.ty)));
split(&nx, &mut p.r, &mut p.g);
split(&ny, &mut p.b, &mut p.a);
p.next_stage();
}
fn pad_x1(p: &mut Pipeline) {
let x = join(&p.r, &p.g);
let x = x.normalize();
split(&x, &mut p.r, &mut p.g);
p.next_stage();
}
fn reflect_x1(p: &mut Pipeline) {
let x = join(&p.r, &p.g);
let two = |x| x + x;
let x = (
(x - f32x16::splat(1.0))
- two(((x - f32x16::splat(1.0)) * f32x16::splat(0.5)).floor())
- f32x16::splat(1.0)
).abs().normalize();
split(&x, &mut p.r, &mut p.g);
p.next_stage();
}
fn repeat_x1(p: &mut Pipeline) {
let x = join(&p.r, &p.g);
let x = (x - x.floor()).normalize();
split(&x, &mut p.r, &mut p.g);
p.next_stage();
}
fn gradient(p: &mut Pipeline) {
let ctx = &p.ctx.gradient;
// N.B. The loop starts at 1 because idx 0 is the color to use before the first stop.
let t = join(&p.r, &p.g);
let mut idx = u16x16::splat(0);
for i in 1..ctx.len {
let tt = ctx.t_values[i].get();
let t0: [f32; 8] = t.0.into();
let t1: [f32; 8] = t.1.into();
idx.0[ 0] += (t0[0] >= tt) as u16;
idx.0[ 1] += (t0[1] >= tt) as u16;
idx.0[ 2] += (t0[2] >= tt) as u16;
idx.0[ 3] += (t0[3] >= tt) as u16;
idx.0[ 4] += (t0[4] >= tt) as u16;
idx.0[ 5] += (t0[5] >= tt) as u16;
idx.0[ 6] += (t0[6] >= tt) as u16;
idx.0[ 7] += (t0[7] >= tt) as u16;
idx.0[ 8] += (t1[0] >= tt) as u16;
idx.0[ 9] += (t1[1] >= tt) as u16;
idx.0[10] += (t1[2] >= tt) as u16;
idx.0[11] += (t1[3] >= tt) as u16;
idx.0[12] += (t1[4] >= tt) as u16;
idx.0[13] += (t1[5] >= tt) as u16;
idx.0[14] += (t1[6] >= tt) as u16;
idx.0[15] += (t1[7] >= tt) as u16;
}
gradient_lookup(ctx, &idx, t, &mut p.r, &mut p.g, &mut p.b, &mut p.a);
p.next_stage();
}
fn evenly_spaced_2_stop_gradient(p: &mut Pipeline) {
let ctx = &p.ctx.evenly_spaced_2_stop_gradient;
let t = join(&p.r, &p.g);
round_f32_to_u16(
mad(t, f32x16::splat(ctx.factor.r), f32x16::splat(ctx.bias.r)),
mad(t, f32x16::splat(ctx.factor.g), f32x16::splat(ctx.bias.g)),
mad(t, f32x16::splat(ctx.factor.b), f32x16::splat(ctx.bias.b)),
mad(t, f32x16::splat(ctx.factor.a), f32x16::splat(ctx.bias.a)),
&mut p.r, &mut p.g, &mut p.b, &mut p.a,
);
p.next_stage();
}
fn xy_to_radius(p: &mut Pipeline) {
let x = join(&p.r, &p.g);
let y = join(&p.b, &p.a);
let x = (x*x + y*y).sqrt();
split(&x, &mut p.r, &mut p.g);
split(&y, &mut p.b, &mut p.a);
p.next_stage();
}
// We are using u16 for the index, not u32 like Skia does, to simplify the code a bit.
// The gradient creation code will not allow that many stops anyway.
fn gradient_lookup(
ctx: &super::GradientCtx, idx: &u16x16, t: f32x16,
r: &mut u16x16, g: &mut u16x16, b: &mut u16x16, a: &mut u16x16,
) {
macro_rules! gather {
($d:expr, $c:ident) => {
// Surprisingly, bounds checking doesn't affect the performance.
// And since `idx` can contain any number, we should leave it in place.
f32x16(
f32x8::from([
$d[idx.0[ 0] as usize].$c,
$d[idx.0[ 1] as usize].$c,
$d[idx.0[ 2] as usize].$c,
$d[idx.0[ 3] as usize].$c,
$d[idx.0[ 4] as usize].$c,
$d[idx.0[ 5] as usize].$c,
$d[idx.0[ 6] as usize].$c,
$d[idx.0[ 7] as usize].$c,
]),
f32x8::from([
$d[idx.0[ 8] as usize].$c,
$d[idx.0[ 9] as usize].$c,
$d[idx.0[10] as usize].$c,
$d[idx.0[11] as usize].$c,
$d[idx.0[12] as usize].$c,
$d[idx.0[13] as usize].$c,
$d[idx.0[14] as usize].$c,
$d[idx.0[15] as usize].$c,
]),
)
};
}
let fr = gather!(&ctx.factors, r);
let fg = gather!(&ctx.factors, g);
let fb = gather!(&ctx.factors, b);
let fa = gather!(&ctx.factors, a);
let br = gather!(&ctx.biases, r);
let bg = gather!(&ctx.biases, g);
let bb = gather!(&ctx.biases, b);
let ba = gather!(&ctx.biases, a);
round_f32_to_u16(
mad(t, fr, br),
mad(t, fg, bg),
mad(t, fb, bb),
mad(t, fa, ba),
r, g, b, a,
);
}
#[inline(always)]
fn round_f32_to_u16(
rf: f32x16, gf: f32x16, bf: f32x16, af: f32x16,
r: &mut u16x16, g: &mut u16x16, b: &mut u16x16, a: &mut u16x16,
) {
// TODO: may produce a slightly different result to Skia
// affects the two_stops_linear_mirror test
let rf = rf.normalize() * f32x16::splat(255.0) + f32x16::splat(0.5);
let gf = gf.normalize() * f32x16::splat(255.0) + f32x16::splat(0.5);
let bf = bf.normalize() * f32x16::splat(255.0) + f32x16::splat(0.5);
let af = af * f32x16::splat(255.0) + f32x16::splat(0.5);
rf.save_to_u16x16(r);
gf.save_to_u16x16(g);
bf.save_to_u16x16(b);
af.save_to_u16x16(a);
}
pub fn just_return(_: &mut Pipeline) {
// Ends the loop.
}
pub fn null_fn(_: &mut Pipeline) {
// Just for unsupported functions in STAGES.
}
#[inline(always)]
fn load_8888(
data: &[PremultipliedColorU8; STAGE_WIDTH],
r: &mut u16x16, g: &mut u16x16, b: &mut u16x16, a: &mut u16x16,
) {
*r = u16x16([
data[ 0].red() as u16, data[ 1].red() as u16, data[ 2].red() as u16, data[ 3].red() as u16,
data[ 4].red() as u16, data[ 5].red() as u16, data[ 6].red() as u16, data[ 7].red() as u16,
data[ 8].red() as u16, data[ 9].red() as u16, data[10].red() as u16, data[11].red() as u16,
data[12].red() as u16, data[13].red() as u16, data[14].red() as u16, data[15].red() as u16,
]);
*g = u16x16([
data[ 0].green() as u16, data[ 1].green() as u16, data[ 2].green() as u16, data[ 3].green() as u16,
data[ 4].green() as u16, data[ 5].green() as u16, data[ 6].green() as u16, data[ 7].green() as u16,
data[ 8].green() as u16, data[ 9].green() as u16, data[10].green() as u16, data[11].green() as u16,
data[12].green() as u16, data[13].green() as u16, data[14].green() as u16, data[15].green() as u16,
]);
*b = u16x16([
data[ 0].blue() as u16, data[ 1].blue() as u16, data[ 2].blue() as u16, data[ 3].blue() as u16,
data[ 4].blue() as u16, data[ 5].blue() as u16, data[ 6].blue() as u16, data[ 7].blue() as u16,
data[ 8].blue() as u16, data[ 9].blue() as u16, data[10].blue() as u16, data[11].blue() as u16,
data[12].blue() as u16, data[13].blue() as u16, data[14].blue() as u16, data[15].blue() as u16,
]);
*a = u16x16([
data[ 0].alpha() as u16, data[ 1].alpha() as u16, data[ 2].alpha() as u16, data[ 3].alpha() as u16,
data[ 4].alpha() as u16, data[ 5].alpha() as u16, data[ 6].alpha() as u16, data[ 7].alpha() as u16,
data[ 8].alpha() as u16, data[ 9].alpha() as u16, data[10].alpha() as u16, data[11].alpha() as u16,
data[12].alpha() as u16, data[13].alpha() as u16, data[14].alpha() as u16, data[15].alpha() as u16,
]);
}
#[inline(always)]
fn load_8888_tail(
tail: usize, data: &[PremultipliedColorU8],
r: &mut u16x16, g: &mut u16x16, b: &mut u16x16, a: &mut u16x16,
) {
// Fill a dummy array with `tail` values. `tail` is always in a 1..STAGE_WIDTH range.
// This way we can reuse the `load_8888` method and remove any branches.
let mut tmp = [PremultipliedColorU8::TRANSPARENT; STAGE_WIDTH];
tmp[0..tail].copy_from_slice(&data[0..tail]);
load_8888(&tmp, r, g, b, a);
}
#[inline(always)]
fn store_8888(
r: &u16x16, g: &u16x16, b: &u16x16, a: &u16x16,
data: &mut [PremultipliedColorU8; STAGE_WIDTH],
) {
let r = r.as_slice();
let g = g.as_slice();
let b = b.as_slice();
let a = a.as_slice();
data[ 0] = PremultipliedColorU8::from_rgba_unchecked(r[ 0] as u8, g[ 0] as u8, b[ 0] as u8, a[ 0] as u8);
data[ 1] = PremultipliedColorU8::from_rgba_unchecked(r[ 1] as u8, g[ 1] as u8, b[ 1] as u8, a[ 1] as u8);
data[ 2] = PremultipliedColorU8::from_rgba_unchecked(r[ 2] as u8, g[ 2] as u8, b[ 2] as u8, a[ 2] as u8);
data[ 3] = PremultipliedColorU8::from_rgba_unchecked(r[ 3] as u8, g[ 3] as u8, b[ 3] as u8, a[ 3] as u8);
data[ 4] = PremultipliedColorU8::from_rgba_unchecked(r[ 4] as u8, g[ 4] as u8, b[ 4] as u8, a[ 4] as u8);
data[ 5] = PremultipliedColorU8::from_rgba_unchecked(r[ 5] as u8, g[ 5] as u8, b[ 5] as u8, a[ 5] as u8);
data[ 6] = PremultipliedColorU8::from_rgba_unchecked(r[ 6] as u8, g[ 6] as u8, b[ 6] as u8, a[ 6] as u8);
data[ 7] = PremultipliedColorU8::from_rgba_unchecked(r[ 7] as u8, g[ 7] as u8, b[ 7] as u8, a[ 7] as u8);
data[ 8] = PremultipliedColorU8::from_rgba_unchecked(r[ 8] as u8, g[ 8] as u8, b[ 8] as u8, a[ 8] as u8);
data[ 9] = PremultipliedColorU8::from_rgba_unchecked(r[ 9] as u8, g[ 9] as u8, b[ 9] as u8, a[ 9] as u8);
data[10] = PremultipliedColorU8::from_rgba_unchecked(r[10] as u8, g[10] as u8, b[10] as u8, a[10] as u8);
data[11] = PremultipliedColorU8::from_rgba_unchecked(r[11] as u8, g[11] as u8, b[11] as u8, a[11] as u8);
data[12] = PremultipliedColorU8::from_rgba_unchecked(r[12] as u8, g[12] as u8, b[12] as u8, a[12] as u8);
data[13] = PremultipliedColorU8::from_rgba_unchecked(r[13] as u8, g[13] as u8, b[13] as u8, a[13] as u8);
data[14] = PremultipliedColorU8::from_rgba_unchecked(r[14] as u8, g[14] as u8, b[14] as u8, a[14] as u8);
data[15] = PremultipliedColorU8::from_rgba_unchecked(r[15] as u8, g[15] as u8, b[15] as u8, a[15] as u8);
}
#[inline(always)]
fn store_8888_tail(
r: &u16x16, g: &u16x16, b: &u16x16, a: &u16x16,
tail: usize, data: &mut [PremultipliedColorU8],
) {
let r = r.as_slice();
let g = g.as_slice();
let b = b.as_slice();
let a = a.as_slice();
// This is better than `for i in 0..tail`, because this way the compiler
// knows that we have only 16 steps and slice access is guaranteed to be valid.
// This removes bounds checking and a possible panic call.
for i in 0..STAGE_WIDTH {
data[i] = PremultipliedColorU8::from_rgba_unchecked(
r[i] as u8, g[i] as u8, b[i] as u8, a[i] as u8,
);
if i + 1 == tail {
break;
}
}
}
#[inline(always)]
fn div255(v: u16x16) -> u16x16 {
// Skia uses `vrshrq_n_u16(vrsraq_n_u16(v, v, 8), 8)` here when NEON is available,
// but it doesn't affect performance much and breaks reproducible results. Ignore it.
// NOTE: the compiler does not replace the division with a shift.
(v + u16x16::splat(255)) >> u16x16::splat(8) // / u16x16::splat(256)
}
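The `(v + 255) >> 8` trick above approximates division by 255 across the whole u8*u8 product range. A scalar sketch (hypothetical helper name) makes the behavior easy to verify:

```rust
// Scalar version of the pipeline's divide-by-255 approximation:
// (v + 255) >> 8 tracks v / 255 over the 8-bit product range and
// is exact at the endpoints.
fn div255_scalar(v: u32) -> u32 {
    (v + 255) >> 8
}

fn main() {
    assert_eq!(div255_scalar(0), 0);
    assert_eq!(div255_scalar(255 * 255), 255); // full alpha * full channel
    assert_eq!(div255_scalar(128 * 255), 128);
    // Never off by more than 1 from true division over the u8*u8 range.
    assert!((0..=255u32 * 255).all(|v| div255_scalar(v).abs_diff(v / 255) <= 1));
}
```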
#[inline(always)]
fn inv(v: u16x16) -> u16x16 {
u16x16::splat(255) - v
}
#[inline(always)]
fn from_float(f: f32) -> u16x16 {
u16x16::splat((f * 255.0 + 0.5) as u16)
}
#[inline(always)]
fn lerp(from: u16x16, to: u16x16, t: u16x16) -> u16x16 {
div255(from * inv(t) + to * t)
}
#[inline(always)]
fn split(v: &f32x16, lo: &mut u16x16, hi: &mut u16x16) {
// We're splitting f32x16 (512bit) into two u16x16 (256 bit).
let data: [u8; 64] = bytemuck::cast(*v);
let d0: &mut [u8; 32] = bytemuck::cast_mut(&mut lo.0);
let d1: &mut [u8; 32] = bytemuck::cast_mut(&mut hi.0);
d0.copy_from_slice(&data[0..32]);
d1.copy_from_slice(&data[32..64]);
}
#[inline(always)]
fn join(lo: &u16x16, hi: &u16x16) -> f32x16 {
// We're joining two u16x16 (256 bit) into f32x16 (512bit).
let d0: [u8; 32] = bytemuck::cast(lo.0);
let d1: [u8; 32] = bytemuck::cast(hi.0);
let mut v = f32x16::default();
let data: &mut [u8; 64] = bytemuck::cast_mut(&mut v);
data[0..32].copy_from_slice(&d0);
data[32..64].copy_from_slice(&d1);
v
}
#[inline(always)]
fn mad(f: f32x16, m: f32x16, a: f32x16) -> f32x16 {
// NEON vmlaq_f32 doesn't seem to affect performance in any way. Ignore it.
f * m + a
}


@@ -0,0 +1,623 @@
// Copyright 2016 Google Inc.
// Copyright 2020 Yevhenii Reizner
//
// Use of this source code is governed by a BSD-style license that can be
// found in the LICENSE file.
/*!
A raster pipeline implementation.
Despite having a lot of changes compared to `SkRasterPipeline`,
the core principles are the same:
1. A pipeline consists of stages.
1. A pipeline has a global context shared by all stages.
Unlike Skia, where each stage has its own, possibly shared, context.
1. Each stage has a high precision implementation. See `highp.rs`.
1. Some stages have a low precision implementation. See `lowp.rs`.
1. Each stage calls the "next" stage after it's done.
1. During pipeline "compilation", if **all** stages have a lowp implementation,
the lowp pipeline will be used. Otherwise, the highp variant will be used.
1. The pipeline "compilation" produces a list of function pointers.
The last pointer is a pointer to the "return" function,
which simply stops the execution of the pipeline.
This implementation is a bit tricky, but it gives the maximum performance.
A simple and straightforward implementation using traits and loops, like:
```ignore
trait StageTrait {
fn apply(&mut self, pixels: &mut [Pixel]);
}
let stages: Vec<&mut dyn StageTrait>;
for stage in stages {
stage.apply(pixels);
}
```
will be at least 20-30% slower. Not really sure why.
Also, since this module is all about performance, any kind of branching is
strictly forbidden. All stage functions must not use `if`, `match` or loops.
There are still some exceptions, which are basically imperfect implementations
and should be optimized out in the future.
*/
use alloc::vec::Vec;
use arrayvec::ArrayVec;
use tiny_skia_path::{NormalizedF32, ScreenIntRect};
use crate::{Color, LengthU32, PremultipliedColor, PremultipliedColorU8, SpreadMode};
use crate::{PixmapRef, Transform};
pub use blitter::RasterPipelineBlitter;
use crate::math::LENGTH_U32_ONE;
use crate::pixmap::SubPixmapMut;
use crate::wide::u32x8;
mod blitter;
#[rustfmt::skip] mod highp;
#[rustfmt::skip] mod lowp;
const MAX_STAGES: usize = 32; // More than enough.
#[allow(dead_code)]
#[derive(Copy, Clone, Debug)]
pub enum Stage {
MoveSourceToDestination = 0,
MoveDestinationToSource,
Clamp0,
ClampA,
Premultiply,
UniformColor,
SeedShader,
LoadDestination,
Store,
Gather,
MaskU8,
ScaleU8,
LerpU8,
Scale1Float,
Lerp1Float,
DestinationAtop,
DestinationIn,
DestinationOut,
DestinationOver,
SourceAtop,
SourceIn,
SourceOut,
SourceOver,
Clear,
Modulate,
Multiply,
Plus,
Screen,
Xor,
ColorBurn,
ColorDodge,
Darken,
Difference,
Exclusion,
HardLight,
Lighten,
Overlay,
SoftLight,
Hue,
Saturation,
Color,
Luminosity,
SourceOverRgba,
Transform,
Reflect,
Repeat,
Bilinear,
Bicubic,
PadX1,
ReflectX1,
RepeatX1,
Gradient,
EvenlySpaced2StopGradient,
XYToRadius,
XYTo2PtConicalFocalOnCircle,
XYTo2PtConicalWellBehaved,
XYTo2PtConicalGreater,
Mask2PtConicalDegenerates,
ApplyVectorMask,
}
pub const STAGES_COUNT: usize = Stage::ApplyVectorMask as usize + 1;
impl<'a> PixmapRef<'a> {
#[inline(always)]
pub(crate) fn gather(&self, index: u32x8) -> [PremultipliedColorU8; highp::STAGE_WIDTH] {
let index: [u32; 8] = bytemuck::cast(index);
let pixels = self.pixels();
[
pixels[index[0] as usize],
pixels[index[1] as usize],
pixels[index[2] as usize],
pixels[index[3] as usize],
pixels[index[4] as usize],
pixels[index[5] as usize],
pixels[index[6] as usize],
pixels[index[7] as usize],
]
}
}
impl<'a> SubPixmapMut<'a> {
#[inline(always)]
pub(crate) fn offset(&self, dx: usize, dy: usize) -> usize {
self.real_width * dy + dx
}
#[inline(always)]
pub(crate) fn slice_at_xy(&mut self, dx: usize, dy: usize) -> &mut [PremultipliedColorU8] {
let offset = self.offset(dx, dy);
&mut self.pixels_mut()[offset..]
}
#[inline(always)]
pub(crate) fn slice4_at_xy(
&mut self,
dx: usize,
dy: usize,
) -> &mut [PremultipliedColorU8; highp::STAGE_WIDTH] {
arrayref::array_mut_ref!(self.pixels_mut(), self.offset(dx, dy), highp::STAGE_WIDTH)
}
#[inline(always)]
pub(crate) fn slice16_at_xy(
&mut self,
dx: usize,
dy: usize,
) -> &mut [PremultipliedColorU8; lowp::STAGE_WIDTH] {
arrayref::array_mut_ref!(self.pixels_mut(), self.offset(dx, dy), lowp::STAGE_WIDTH)
}
}
#[derive(Default, Debug)]
pub struct AAMaskCtx {
pub pixels: [u8; 2],
pub stride: u32, // can be zero
pub shift: usize, // mask offset/position in pixmap coordinates
}
impl AAMaskCtx {
#[inline(always)]
pub fn copy_at_xy(&self, dx: usize, dy: usize, tail: usize) -> [u8; 2] {
let offset = (self.stride as usize * dy + dx) - self.shift;
// We have only 3 variants, so unroll them.
match (offset, tail) {
(0, 1) => [self.pixels[0], 0],
(0, 2) => [self.pixels[0], self.pixels[1]],
(1, 1) => [self.pixels[1], 0],
_ => [0, 0], // unreachable
}
}
}
#[derive(Copy, Clone, Debug)]
pub struct ClipMaskCtx<'a> {
pub data: &'a [u8],
pub stride: LengthU32,
}
impl Default for ClipMaskCtx<'_> {
fn default() -> Self {
ClipMaskCtx {
data: &[],
stride: LENGTH_U32_ONE,
}
}
}
impl ClipMaskCtx<'_> {
#[inline(always)]
fn offset(&self, dx: usize, dy: usize) -> usize {
self.stride.get() as usize * dy + dx
}
}
#[derive(Default)]
pub struct Context {
pub current_coverage: f32,
pub sampler: SamplerCtx,
pub uniform_color: UniformColorCtx,
pub evenly_spaced_2_stop_gradient: EvenlySpaced2StopGradientCtx,
pub gradient: GradientCtx,
pub two_point_conical_gradient: TwoPointConicalGradientCtx,
pub limit_x: TileCtx,
pub limit_y: TileCtx,
pub transform: Transform,
}
#[derive(Copy, Clone, Default, Debug)]
pub struct SamplerCtx {
pub spread_mode: SpreadMode,
pub inv_width: f32,
pub inv_height: f32,
}
#[derive(Copy, Clone, Default, Debug)]
pub struct UniformColorCtx {
pub r: f32,
pub g: f32,
pub b: f32,
pub a: f32,
pub rgba: [u16; 4], // [0,255] in a 16-bit lane.
}
// A gradient color is an unpremultiplied RGBA color not limited to the 0..1 range.
// It can basically have any float value.
#[derive(Copy, Clone, Default, Debug)]
pub struct GradientColor {
pub r: f32,
pub g: f32,
pub b: f32,
pub a: f32,
}
impl GradientColor {
pub fn new(r: f32, g: f32, b: f32, a: f32) -> Self {
GradientColor { r, g, b, a }
}
}
impl From<Color> for GradientColor {
fn from(c: Color) -> Self {
GradientColor {
r: c.red(),
g: c.green(),
b: c.blue(),
a: c.alpha(),
}
}
}
#[derive(Copy, Clone, Default, Debug)]
pub struct EvenlySpaced2StopGradientCtx {
pub factor: GradientColor,
pub bias: GradientColor,
}
#[derive(Clone, Default, Debug)]
pub struct GradientCtx {
/// This value stores the actual number of colors.
/// `factors` and `biases` must store at least 16 values,
/// since this is the length of a lowp pipeline stage.
/// So any value past `len` is just zeros.
pub len: usize,
pub factors: Vec<GradientColor>,
pub biases: Vec<GradientColor>,
pub t_values: Vec<NormalizedF32>,
}
impl GradientCtx {
pub fn push_const_color(&mut self, color: GradientColor) {
self.factors.push(GradientColor::new(0.0, 0.0, 0.0, 0.0));
self.biases.push(color);
}
}
#[derive(Copy, Clone, Default, Debug)]
pub struct TwoPointConicalGradientCtx {
// This context is used only in highp, where we use Tx4.
pub mask: u32x8,
pub p0: f32,
}
#[derive(Copy, Clone, Default, Debug)]
pub struct TileCtx {
pub scale: f32,
pub inv_scale: f32, // cache of 1/scale
}
pub struct RasterPipelineBuilder {
stages: ArrayVec<Stage, MAX_STAGES>,
force_hq_pipeline: bool,
pub ctx: Context,
}
impl RasterPipelineBuilder {
pub fn new() -> Self {
RasterPipelineBuilder {
stages: ArrayVec::new(),
force_hq_pipeline: false,
ctx: Context::default(),
}
}
pub fn set_force_hq_pipeline(&mut self, hq: bool) {
self.force_hq_pipeline = hq;
}
pub fn push(&mut self, stage: Stage) {
self.stages.push(stage);
}
pub fn push_transform(&mut self, ts: Transform) {
if ts.is_finite() && !ts.is_identity() {
self.stages.push(Stage::Transform);
self.ctx.transform = ts;
}
}
pub fn push_uniform_color(&mut self, c: PremultipliedColor) {
let r = c.red();
let g = c.green();
let b = c.blue();
let a = c.alpha();
let rgba = [
(r * 255.0 + 0.5) as u16,
(g * 255.0 + 0.5) as u16,
(b * 255.0 + 0.5) as u16,
(a * 255.0 + 0.5) as u16,
];
let ctx = UniformColorCtx { r, g, b, a, rgba };
self.stages.push(Stage::UniformColor);
self.ctx.uniform_color = ctx;
}
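The `(c * 255.0 + 0.5) as u16` conversion above rounds a [0, 1] float channel to the nearest 8-bit value stored in a 16-bit lane. A scalar sketch (hypothetical helper name):

```rust
// Round-to-nearest conversion from a [0, 1] float channel to an
// 8-bit value in a 16-bit lane: add 0.5 before the truncating cast.
fn to_u8_lane(c: f32) -> u16 {
    (c * 255.0 + 0.5) as u16
}

fn main() {
    assert_eq!(to_u8_lane(0.0), 0);
    assert_eq!(to_u8_lane(1.0), 255);
    assert_eq!(to_u8_lane(0.5), 128); // 127.5 + 0.5 == 128.0
}
```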
pub fn compile(self) -> RasterPipeline {
if self.stages.is_empty() {
return RasterPipeline {
kind: RasterPipelineKind::High {
functions: ArrayVec::new(),
tail_functions: ArrayVec::new(),
},
ctx: Context::default(),
};
}
let is_lowp_compatible = self
.stages
.iter()
.all(|stage| !lowp::fn_ptr_eq(lowp::STAGES[*stage as usize], lowp::null_fn));
if self.force_hq_pipeline || !is_lowp_compatible {
let mut functions: ArrayVec<_, MAX_STAGES> = self
.stages
.iter()
.map(|stage| highp::STAGES[*stage as usize] as highp::StageFn)
.collect();
functions.push(highp::just_return as highp::StageFn);
// I wasn't able to reproduce Skia's load_8888_/store_8888_ performance.
// Skia uses a fallthrough switch, which is probably the reason.
// In Rust, any branching in load/store code drastically affects the performance.
// So instead, we're using two "programs": one for "full stages" and one for "tail stages",
// where the only difference is the load/store methods.
let mut tail_functions = functions.clone();
for fun in &mut tail_functions {
if highp::fn_ptr(*fun) == highp::fn_ptr(highp::load_dst) {
*fun = highp::load_dst_tail as highp::StageFn;
} else if highp::fn_ptr(*fun) == highp::fn_ptr(highp::store) {
*fun = highp::store_tail as highp::StageFn;
} else if highp::fn_ptr(*fun) == highp::fn_ptr(highp::source_over_rgba) {
// SourceOverRgba calls load/store manually, without the pipeline,
// therefore we have to switch it too.
*fun = highp::source_over_rgba_tail as highp::StageFn;
}
}
RasterPipeline {
kind: RasterPipelineKind::High {
functions,
tail_functions,
},
ctx: self.ctx,
}
} else {
let mut functions: ArrayVec<_, MAX_STAGES> = self
.stages
.iter()
.map(|stage| lowp::STAGES[*stage as usize] as lowp::StageFn)
.collect();
functions.push(lowp::just_return as lowp::StageFn);
// See above.
let mut tail_functions = functions.clone();
for fun in &mut tail_functions {
if lowp::fn_ptr(*fun) == lowp::fn_ptr(lowp::load_dst) {
*fun = lowp::load_dst_tail as lowp::StageFn;
} else if lowp::fn_ptr(*fun) == lowp::fn_ptr(lowp::store) {
*fun = lowp::store_tail as lowp::StageFn;
} else if lowp::fn_ptr(*fun) == lowp::fn_ptr(lowp::source_over_rgba) {
// SourceOverRgba calls load/store manually, without the pipeline,
// therefore we have to switch it too.
*fun = lowp::source_over_rgba_tail as lowp::StageFn;
}
}
RasterPipeline {
kind: RasterPipelineKind::Low {
functions,
tail_functions,
},
ctx: self.ctx,
}
}
}
}
pub enum RasterPipelineKind {
High {
functions: ArrayVec<highp::StageFn, MAX_STAGES>,
tail_functions: ArrayVec<highp::StageFn, MAX_STAGES>,
},
Low {
functions: ArrayVec<lowp::StageFn, MAX_STAGES>,
tail_functions: ArrayVec<lowp::StageFn, MAX_STAGES>,
},
}
pub struct RasterPipeline {
kind: RasterPipelineKind,
pub ctx: Context,
}
impl RasterPipeline {
pub fn run(
&mut self,
rect: &ScreenIntRect,
mask_ctx: AAMaskCtx,
clip_mask_ctx: ClipMaskCtx,
pixmap_src: PixmapRef,
pixmap_dst: &mut SubPixmapMut,
) {
match self.kind {
RasterPipelineKind::High {
ref functions,
ref tail_functions,
} => {
highp::start(
functions.as_slice(),
tail_functions.as_slice(),
rect,
mask_ctx,
clip_mask_ctx,
&mut self.ctx,
pixmap_src,
pixmap_dst,
);
}
RasterPipelineKind::Low {
ref functions,
ref tail_functions,
} => {
lowp::start(
functions.as_slice(),
tail_functions.as_slice(),
rect,
mask_ctx,
clip_mask_ctx,
&mut self.ctx,
// lowp doesn't support pattern, so no `pixmap_src` for it.
pixmap_dst,
);
}
}
}
}
#[rustfmt::skip]
#[cfg(test)]
mod blend_tests {
// Test blending modes.
//
// Skia has two kinds of a raster pipeline: high and low precision.
// "High" uses f32 and "low" uses u16.
// And for basic operations we don't need f32, since u16 is simply faster.
// But those modes are not identical. They can produce slightly different results
// due to rounding.
use super::*;
use crate::{BlendMode, Color, Pixmap, PremultipliedColorU8};
macro_rules! test_blend {
($name:ident, $mode:expr, $is_highp:expr, $r:expr, $g:expr, $b:expr, $a:expr) => {
#[test]
fn $name() {
let mut pixmap = Pixmap::new(1, 1).unwrap();
pixmap.fill(Color::from_rgba8(50, 127, 150, 200));
let pixmap_src = PixmapRef::from_bytes(&[0, 0, 0, 0], 1, 1).unwrap();
let mut p = RasterPipelineBuilder::new();
p.set_force_hq_pipeline($is_highp);
p.push_uniform_color(Color::from_rgba8(220, 140, 75, 180).premultiply());
p.push(Stage::LoadDestination);
p.push($mode.to_stage().unwrap());
p.push(Stage::Store);
let mut p = p.compile();
let rect = pixmap.size().to_screen_int_rect(0, 0);
p.run(&rect, AAMaskCtx::default(), ClipMaskCtx::default(), pixmap_src,
&mut pixmap.as_mut().as_subpixmap());
assert_eq!(
pixmap.as_ref().pixel(0, 0).unwrap(),
PremultipliedColorU8::from_rgba($r, $g, $b, $a).unwrap()
);
}
};
}
macro_rules! test_blend_lowp {
($name:ident, $mode:expr, $r:expr, $g:expr, $b:expr, $a:expr) => (
test_blend!{$name, $mode, false, $r, $g, $b, $a}
)
}
macro_rules! test_blend_highp {
($name:ident, $mode:expr, $r:expr, $g:expr, $b:expr, $a:expr) => (
test_blend!{$name, $mode, true, $r, $g, $b, $a}
)
}
test_blend_lowp!(clear_lowp, BlendMode::Clear, 0, 0, 0, 0);
// Source is a no-op
test_blend_lowp!(destination_lowp, BlendMode::Destination, 39, 100, 118, 200);
test_blend_lowp!(source_over_lowp, BlendMode::SourceOver, 167, 129, 88, 239);
test_blend_lowp!(destination_over_lowp, BlendMode::DestinationOver, 73, 122, 130, 239);
test_blend_lowp!(source_in_lowp, BlendMode::SourceIn, 122, 78, 42, 141);
test_blend_lowp!(destination_in_lowp, BlendMode::DestinationIn, 28, 71, 83, 141);
test_blend_lowp!(source_out_lowp, BlendMode::SourceOut, 34, 22, 12, 39);
test_blend_lowp!(destination_out_lowp, BlendMode::DestinationOut, 12, 30, 35, 59);
test_blend_lowp!(source_atop_lowp, BlendMode::SourceAtop, 133, 107, 76, 200);
test_blend_lowp!(destination_atop_lowp, BlendMode::DestinationAtop, 61, 92, 95, 180);
test_blend_lowp!(xor_lowp, BlendMode::Xor, 45, 51, 46, 98);
test_blend_lowp!(plus_lowp, BlendMode::Plus, 194, 199, 171, 255);
test_blend_lowp!(modulate_lowp, BlendMode::Modulate, 24, 39, 25, 141);
test_blend_lowp!(screen_lowp, BlendMode::Screen, 170, 160, 146, 239);
test_blend_lowp!(overlay_lowp, BlendMode::Overlay, 92, 128, 106, 239);
test_blend_lowp!(darken_lowp, BlendMode::Darken, 72, 121, 88, 239);
test_blend_lowp!(lighten_lowp, BlendMode::Lighten, 166, 128, 129, 239);
    // ColorDodge is not available for lowp.
    // ColorBurn is not available for lowp.
test_blend_lowp!(hard_light_lowp, BlendMode::HardLight, 154, 128, 95, 239);
    // SoftLight is not available for lowp.
test_blend_lowp!(difference_lowp, BlendMode::Difference, 138, 57, 87, 239);
test_blend_lowp!(exclusion_lowp, BlendMode::Exclusion, 146, 121, 121, 239);
test_blend_lowp!(multiply_lowp, BlendMode::Multiply, 69, 90, 71, 238);
    // Hue is not available for lowp.
    // Saturation is not available for lowp.
    // Color is not available for lowp.
    // Luminosity is not available for lowp.
test_blend_highp!(clear_highp, BlendMode::Clear, 0, 0, 0, 0);
// Source is a no-op
test_blend_highp!(destination_highp, BlendMode::Destination, 39, 100, 118, 200);
test_blend_highp!(source_over_highp, BlendMode::SourceOver, 167, 128, 88, 239);
test_blend_highp!(destination_over_highp, BlendMode::DestinationOver, 72, 121, 129, 239);
test_blend_highp!(source_in_highp, BlendMode::SourceIn, 122, 78, 42, 141);
test_blend_highp!(destination_in_highp, BlendMode::DestinationIn, 28, 71, 83, 141);
test_blend_highp!(source_out_highp, BlendMode::SourceOut, 33, 21, 11, 39);
test_blend_highp!(destination_out_highp, BlendMode::DestinationOut, 11, 29, 35, 59);
test_blend_highp!(source_atop_highp, BlendMode::SourceAtop, 133, 107, 76, 200);
test_blend_highp!(destination_atop_highp, BlendMode::DestinationAtop, 61, 92, 95, 180);
test_blend_highp!(xor_highp, BlendMode::Xor, 45, 51, 46, 98);
test_blend_highp!(plus_highp, BlendMode::Plus, 194, 199, 171, 255);
test_blend_highp!(modulate_highp, BlendMode::Modulate, 24, 39, 24, 141);
test_blend_highp!(screen_highp, BlendMode::Screen, 171, 160, 146, 239);
test_blend_highp!(overlay_highp, BlendMode::Overlay, 92, 128, 106, 239);
test_blend_highp!(darken_highp, BlendMode::Darken, 72, 121, 88, 239);
test_blend_highp!(lighten_highp, BlendMode::Lighten, 167, 128, 129, 239);
test_blend_highp!(color_dodge_highp, BlendMode::ColorDodge, 186, 192, 164, 239);
test_blend_highp!(color_burn_highp, BlendMode::ColorBurn, 54, 63, 46, 239);
test_blend_highp!(hard_light_highp, BlendMode::HardLight, 155, 128, 95, 239);
test_blend_highp!(soft_light_highp, BlendMode::SoftLight, 98, 124, 115, 239);
test_blend_highp!(difference_highp, BlendMode::Difference, 139, 58, 88, 239);
test_blend_highp!(exclusion_highp, BlendMode::Exclusion, 147, 121, 122, 239);
test_blend_highp!(multiply_highp, BlendMode::Multiply, 69, 89, 71, 239);
test_blend_highp!(hue_highp, BlendMode::Hue, 128, 103, 74, 239);
test_blend_highp!(saturation_highp, BlendMode::Saturation, 59, 126, 140, 239);
test_blend_highp!(color_highp, BlendMode::Color, 139, 100, 60, 239);
test_blend_highp!(luminosity_highp, BlendMode::Luminosity, 100, 149, 157, 239);
}
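The expected values in the tables above follow from the premultiplied-alpha blend formulas. As a standalone illustrative sketch (not part of this module; the helper names are hypothetical), here is the source-over arithmetic that reproduces the `source_over_highp` value:

```rust
// Premultiplied source-over: dst' = src + dst * (1 - src_alpha).
// Channels are premultiplied first; math is done in f32, like the highp pipeline.
fn premultiply(c: u8, a: u8) -> f32 {
    c as f32 * (a as f32 / 255.0)
}

fn source_over(src: (u8, u8, u8, u8), dst: (u8, u8, u8, u8)) -> (u8, u8, u8, u8) {
    let inv_sa = 1.0 - src.3 as f32 / 255.0;
    let blend =
        |s: u8, d: u8| (premultiply(s, src.3) + premultiply(d, dst.3) * inv_sa).round() as u8;
    let a = (src.3 as f32 + dst.3 as f32 * inv_sa).round() as u8;
    (blend(src.0, dst.0), blend(src.1, dst.1), blend(src.2, dst.2), a)
}

fn main() {
    // Same inputs as the tests: src (220, 140, 75, 180) over dst (50, 127, 150, 200).
    assert_eq!(
        source_over((220, 140, 75, 180), (50, 127, 150, 200)),
        (167, 128, 88, 239)
    );
}
```

The lowp variants of the same tests differ by at most one unit per channel, which is exactly the u16-vs-f32 rounding difference described above.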

// Copyright 2006 The Android Open Source Project
// Copyright 2020 Yevhenii Reizner
//
// Use of this source code is governed by a BSD-style license that can be
// found in the LICENSE file.
use alloc::vec;
use alloc::vec::Vec;
use core::convert::TryFrom;
use core::num::NonZeroUsize;
use tiny_skia_path::{IntSize, ScreenIntRect};
use crate::{Color, IntRect};
use crate::color::PremultipliedColorU8;
#[cfg(feature = "png-format")]
use crate::color::{premultiply_u8, ALPHA_U8_OPAQUE};
/// Number of bytes per pixel.
pub const BYTES_PER_PIXEL: usize = 4;
/// A container that owns premultiplied RGBA pixels.
///
/// The data is not aligned, therefore width == stride.
#[derive(Clone, PartialEq)]
pub struct Pixmap {
data: Vec<u8>,
size: IntSize,
}
impl Pixmap {
/// Allocates a new pixmap.
///
/// A pixmap is filled with transparent black by default, aka (0, 0, 0, 0).
///
    /// Zero size is an error.
///
/// Pixmap's width is limited by i32::MAX/4.
pub fn new(width: u32, height: u32) -> Option<Self> {
let size = IntSize::from_wh(width, height)?;
let data_len = data_len_for_size(size)?;
// We cannot check that allocation was successful yet.
// We have to wait for https://github.com/rust-lang/rust/issues/48043
Some(Pixmap {
data: vec![0; data_len],
size,
})
}
/// Creates a new pixmap by taking ownership over an image buffer
/// (premultiplied RGBA pixels).
///
/// The size needs to match the data provided.
///
/// Pixmap's width is limited by i32::MAX/4.
pub fn from_vec(data: Vec<u8>, size: IntSize) -> Option<Self> {
let data_len = data_len_for_size(size)?;
if data.len() != data_len {
return None;
}
Some(Pixmap { data, size })
}
    /// Decodes PNG data into a `Pixmap`.
///
/// Only 8-bit images are supported.
    /// Indexed PNGs are not supported.
#[cfg(feature = "png-format")]
pub fn decode_png(data: &[u8]) -> Result<Self, png::DecodingError> {
fn make_custom_png_error(msg: &str) -> png::DecodingError {
std::io::Error::new(std::io::ErrorKind::Other, msg).into()
}
let mut decoder = png::Decoder::new(data);
decoder.set_transformations(png::Transformations::normalize_to_color8());
let mut reader = decoder.read_info()?;
let mut img_data = vec![0; reader.output_buffer_size()];
let info = reader.next_frame(&mut img_data)?;
if info.bit_depth != png::BitDepth::Eight {
return Err(make_custom_png_error("unsupported bit depth"));
}
let size = IntSize::from_wh(info.width, info.height)
.ok_or_else(|| make_custom_png_error("invalid image size"))?;
let data_len =
data_len_for_size(size).ok_or_else(|| make_custom_png_error("image is too big"))?;
img_data = match info.color_type {
png::ColorType::Rgb => {
let mut rgba_data = Vec::with_capacity(data_len);
for rgb in img_data.chunks(3) {
rgba_data.push(rgb[0]);
rgba_data.push(rgb[1]);
rgba_data.push(rgb[2]);
rgba_data.push(ALPHA_U8_OPAQUE);
}
rgba_data
}
png::ColorType::Rgba => img_data,
png::ColorType::Grayscale => {
let mut rgba_data = Vec::with_capacity(data_len);
for gray in img_data {
rgba_data.push(gray);
rgba_data.push(gray);
rgba_data.push(gray);
rgba_data.push(ALPHA_U8_OPAQUE);
}
rgba_data
}
png::ColorType::GrayscaleAlpha => {
let mut rgba_data = Vec::with_capacity(data_len);
for slice in img_data.chunks(2) {
let gray = slice[0];
let alpha = slice[1];
rgba_data.push(gray);
rgba_data.push(gray);
rgba_data.push(gray);
rgba_data.push(alpha);
}
rgba_data
}
png::ColorType::Indexed => {
return Err(make_custom_png_error("indexed PNG is not supported"));
}
};
// Premultiply alpha.
//
        // We cannot use RasterPipeline here, even though it is faster,
        // because it produces slightly different results.
// Seems like Skia does the same.
//
        // Also, in our tests the unsafe version (without bounds checking)
        // had roughly the same performance, so we keep the safe one.
for pixel in img_data.as_mut_slice().chunks_mut(BYTES_PER_PIXEL) {
let a = pixel[3];
pixel[0] = premultiply_u8(pixel[0], a);
pixel[1] = premultiply_u8(pixel[1], a);
pixel[2] = premultiply_u8(pixel[2], a);
}
Pixmap::from_vec(img_data, size)
.ok_or_else(|| make_custom_png_error("failed to create a pixmap"))
}
/// Loads a PNG file into a `Pixmap`.
///
/// Only 8-bit images are supported.
    /// Indexed PNGs are not supported.
#[cfg(feature = "png-format")]
pub fn load_png<P: AsRef<std::path::Path>>(path: P) -> Result<Self, png::DecodingError> {
        // `png::Decoder` is generic over the input, which means it would be instantiated
        // twice: once for `&[]` and once for `File`, which would simply bloat the code.
        // Therefore we're using only one input type.
let data = std::fs::read(path)?;
Self::decode_png(&data)
}
    /// Encodes the pixmap into PNG data.
#[cfg(feature = "png-format")]
pub fn encode_png(&self) -> Result<Vec<u8>, png::EncodingError> {
self.as_ref().encode_png()
}
/// Saves pixmap as a PNG file.
#[cfg(feature = "png-format")]
pub fn save_png<P: AsRef<std::path::Path>>(&self, path: P) -> Result<(), png::EncodingError> {
self.as_ref().save_png(path)
}
/// Returns a container that references Pixmap's data.
pub fn as_ref(&self) -> PixmapRef {
PixmapRef {
data: &self.data,
size: self.size,
}
}
    /// Returns a container that mutably references Pixmap's data.
pub fn as_mut(&mut self) -> PixmapMut {
PixmapMut {
data: &mut self.data,
size: self.size,
}
}
/// Returns pixmap's width.
#[inline]
pub fn width(&self) -> u32 {
self.size.width()
}
/// Returns pixmap's height.
#[inline]
pub fn height(&self) -> u32 {
self.size.height()
}
/// Returns pixmap's size.
#[allow(dead_code)]
pub(crate) fn size(&self) -> IntSize {
self.size
}
/// Fills the entire pixmap with a specified color.
pub fn fill(&mut self, color: Color) {
let c = color.premultiply().to_color_u8();
for p in self.as_mut().pixels_mut() {
*p = c;
}
}
/// Returns the internal data.
///
/// Byteorder: RGBA
pub fn data(&self) -> &[u8] {
self.data.as_slice()
}
/// Returns the mutable internal data.
///
/// Byteorder: RGBA
pub fn data_mut(&mut self) -> &mut [u8] {
self.data.as_mut_slice()
}
/// Returns a pixel color.
///
/// Returns `None` when position is out of bounds.
pub fn pixel(&self, x: u32, y: u32) -> Option<PremultipliedColorU8> {
let idx = self.width().checked_mul(y)?.checked_add(x)?;
self.pixels().get(idx as usize).cloned()
}
/// Returns a mutable slice of pixels.
pub fn pixels_mut(&mut self) -> &mut [PremultipliedColorU8] {
bytemuck::cast_slice_mut(self.data_mut())
}
/// Returns a slice of pixels.
pub fn pixels(&self) -> &[PremultipliedColorU8] {
bytemuck::cast_slice(self.data())
}
/// Consumes the internal data.
///
/// Byteorder: RGBA
pub fn take(self) -> Vec<u8> {
self.data
}
/// Returns a copy of the pixmap that intersects the `rect`.
///
    /// Returns `None` when `rect` doesn't overlap the `Pixmap`'s rect.
pub fn clone_rect(&self, rect: IntRect) -> Option<Pixmap> {
self.as_ref().clone_rect(rect)
}
}
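`Pixmap::pixel` relies on width == stride, so pixel (x, y) lives at index `y * width + x` in the pixel slice. A minimal standalone sketch of that row-major indexing (the function name is hypothetical):

```rust
// Row-major indexing with width == stride: each row holds `width` pixels,
// so (x, y) maps to y * width + x (times BYTES_PER_PIXEL for the byte offset).
// Checked arithmetic rejects index overflow, like the `pixel` method above.
fn pixel_index(width: u32, x: u32, y: u32) -> Option<usize> {
    let idx = width.checked_mul(y)?.checked_add(x)?;
    Some(idx as usize)
}

fn main() {
    assert_eq!(pixel_index(4, 2, 3), Some(14)); // 3 * 4 + 2
    assert_eq!(pixel_index(u32::MAX, 2, u32::MAX), None); // overflow is rejected
}
```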
impl core::fmt::Debug for Pixmap {
fn fmt(&self, f: &mut core::fmt::Formatter<'_>) -> core::fmt::Result {
f.debug_struct("Pixmap")
.field("data", &"...")
.field("width", &self.size.width())
.field("height", &self.size.height())
.finish()
}
}
/// A container that references premultiplied RGBA pixels.
///
/// Can be created from a `Pixmap` or from user-provided data.
///
/// The data is not aligned, therefore width == stride.
#[derive(Clone, Copy, PartialEq)]
pub struct PixmapRef<'a> {
data: &'a [u8],
size: IntSize,
}
impl<'a> PixmapRef<'a> {
/// Creates a new `PixmapRef` from bytes.
///
    /// The data length must be at least `width * height * BYTES_PER_PIXEL`.
    /// Zero size is an error. Width is limited by i32::MAX/4.
///
/// The `data` is assumed to have premultiplied RGBA pixels (byteorder: RGBA).
pub fn from_bytes(data: &'a [u8], width: u32, height: u32) -> Option<Self> {
let size = IntSize::from_wh(width, height)?;
let data_len = data_len_for_size(size)?;
if data.len() < data_len {
return None;
}
Some(PixmapRef { data, size })
}
/// Creates a new `Pixmap` from the current data.
///
/// Clones the underlying data.
pub fn to_owned(&self) -> Pixmap {
Pixmap {
data: self.data.to_vec(),
size: self.size,
}
}
/// Returns pixmap's width.
#[inline]
pub fn width(&self) -> u32 {
self.size.width()
}
/// Returns pixmap's height.
#[inline]
pub fn height(&self) -> u32 {
self.size.height()
}
/// Returns pixmap's size.
pub(crate) fn size(&self) -> IntSize {
self.size
}
/// Returns pixmap's rect.
pub(crate) fn rect(&self) -> ScreenIntRect {
self.size.to_screen_int_rect(0, 0)
}
/// Returns the internal data.
///
/// Byteorder: RGBA
pub fn data(&self) -> &'a [u8] {
self.data
}
/// Returns a pixel color.
///
/// Returns `None` when position is out of bounds.
pub fn pixel(&self, x: u32, y: u32) -> Option<PremultipliedColorU8> {
let idx = self.width().checked_mul(y)?.checked_add(x)?;
self.pixels().get(idx as usize).cloned()
}
/// Returns a slice of pixels.
pub fn pixels(&self) -> &'a [PremultipliedColorU8] {
bytemuck::cast_slice(self.data())
}
// TODO: add rows() iterator
/// Returns a copy of the pixmap that intersects the `rect`.
///
    /// Returns `None` when `rect` doesn't overlap the `Pixmap`'s rect.
pub fn clone_rect(&self, rect: IntRect) -> Option<Pixmap> {
// TODO: to ScreenIntRect?
let rect = self.rect().to_int_rect().intersect(&rect)?;
let mut new = Pixmap::new(rect.width(), rect.height())?;
{
let old_pixels = self.pixels();
let mut new_mut = new.as_mut();
let new_pixels = new_mut.pixels_mut();
// TODO: optimize
for y in 0..rect.height() {
for x in 0..rect.width() {
let old_idx = (y + rect.y() as u32) * self.width() + (x + rect.x() as u32);
let new_idx = y * rect.width() + x;
new_pixels[new_idx as usize] = old_pixels[old_idx as usize];
}
}
}
Some(new)
}
    /// Encodes the pixmap into PNG data.
#[cfg(feature = "png-format")]
pub fn encode_png(&self) -> Result<Vec<u8>, png::EncodingError> {
// Skia uses skcms here, which is somewhat similar to RasterPipeline.
        // Sadly, we have to copy the pixmap here because of the alpha demultiplication.
        // Not sure how to avoid this.
// TODO: remove allocation
let mut tmp_pixmap = self.to_owned();
// Demultiply alpha.
//
// RasterPipeline is 15% faster here, but produces slightly different results
// due to rounding. So we stick with this method for now.
for pixel in tmp_pixmap.pixels_mut() {
let c = pixel.demultiply();
*pixel =
PremultipliedColorU8::from_rgba_unchecked(c.red(), c.green(), c.blue(), c.alpha());
}
let mut data = Vec::new();
{
let mut encoder = png::Encoder::new(&mut data, self.width(), self.height());
encoder.set_color(png::ColorType::Rgba);
encoder.set_depth(png::BitDepth::Eight);
let mut writer = encoder.write_header()?;
writer.write_image_data(&tmp_pixmap.data)?;
}
Ok(data)
}
/// Saves pixmap as a PNG file.
#[cfg(feature = "png-format")]
pub fn save_png<P: AsRef<std::path::Path>>(&self, path: P) -> Result<(), png::EncodingError> {
let data = self.encode_png()?;
std::fs::write(path, data)?;
Ok(())
}
}
impl core::fmt::Debug for PixmapRef<'_> {
fn fmt(&self, f: &mut core::fmt::Formatter<'_>) -> core::fmt::Result {
f.debug_struct("PixmapRef")
.field("data", &"...")
.field("width", &self.size.width())
.field("height", &self.size.height())
.finish()
}
}
/// A container that references mutable premultiplied RGBA pixels.
///
/// Can be created from a `Pixmap` or from user-provided data.
///
/// The data is not aligned, therefore width == stride.
#[derive(PartialEq)]
pub struct PixmapMut<'a> {
data: &'a mut [u8],
size: IntSize,
}
impl<'a> PixmapMut<'a> {
/// Creates a new `PixmapMut` from bytes.
///
    /// The data length must be at least `width * height * BYTES_PER_PIXEL`.
    /// Zero size is an error. Width is limited by i32::MAX/4.
///
/// The `data` is assumed to have premultiplied RGBA pixels (byteorder: RGBA).
pub fn from_bytes(data: &'a mut [u8], width: u32, height: u32) -> Option<Self> {
let size = IntSize::from_wh(width, height)?;
let data_len = data_len_for_size(size)?;
if data.len() < data_len {
return None;
}
Some(PixmapMut { data, size })
}
/// Creates a new `Pixmap` from the current data.
///
/// Clones the underlying data.
pub fn to_owned(&self) -> Pixmap {
Pixmap {
data: self.data.to_vec(),
size: self.size,
}
}
/// Returns a container that references Pixmap's data.
pub fn as_ref(&self) -> PixmapRef {
PixmapRef {
data: self.data,
size: self.size,
}
}
/// Returns pixmap's width.
#[inline]
pub fn width(&self) -> u32 {
self.size.width()
}
/// Returns pixmap's height.
#[inline]
pub fn height(&self) -> u32 {
self.size.height()
}
/// Returns pixmap's size.
pub(crate) fn size(&self) -> IntSize {
self.size
}
/// Fills the entire pixmap with a specified color.
pub fn fill(&mut self, color: Color) {
let c = color.premultiply().to_color_u8();
for p in self.pixels_mut() {
*p = c;
}
}
/// Returns the mutable internal data.
///
/// Byteorder: RGBA
pub fn data_mut(&mut self) -> &mut [u8] {
self.data
}
/// Returns a mutable slice of pixels.
pub fn pixels_mut(&mut self) -> &mut [PremultipliedColorU8] {
bytemuck::cast_slice_mut(self.data_mut())
}
/// Creates `SubPixmapMut` that contains the whole `PixmapMut`.
pub(crate) fn as_subpixmap(&mut self) -> SubPixmapMut {
SubPixmapMut {
size: self.size(),
real_width: self.width() as usize,
data: &mut self.data,
}
}
/// Returns a mutable reference to the pixmap region that intersects the `rect`.
///
    /// Returns `None` when `rect` doesn't overlap the `Pixmap`'s rect.
pub(crate) fn subpixmap(&mut self, rect: IntRect) -> Option<SubPixmapMut> {
let rect = self.size.to_int_rect(0, 0).intersect(&rect)?;
let row_bytes = self.width() as usize * BYTES_PER_PIXEL;
let offset = rect.top() as usize * row_bytes + rect.left() as usize * BYTES_PER_PIXEL;
Some(SubPixmapMut {
size: rect.size(),
real_width: self.width() as usize,
data: &mut self.data[offset..],
})
}
}
impl core::fmt::Debug for PixmapMut<'_> {
fn fmt(&self, f: &mut core::fmt::Formatter<'_>) -> core::fmt::Result {
f.debug_struct("PixmapMut")
.field("data", &"...")
.field("width", &self.size.width())
.field("height", &self.size.height())
.finish()
}
}
/// A `PixmapMut` subregion.
///
/// Unlike `PixmapMut`, contains `real_width` which references the parent `PixmapMut` width.
/// This way we can operate on a `PixmapMut` subregion without reallocations.
/// Primarily required because of `DrawTiler`.
///
/// We cannot implement this in `PixmapMut` directly, because it would break `fill`, `data_mut`,
/// `pixels_mut` and other similar methods.
/// This is because `SubPixmapMut.data` references more data than it is actually allowed to access.
/// On the other hand, `PixmapMut.data` can access all of its data, which is stored linearly.
pub struct SubPixmapMut<'a> {
pub data: &'a mut [u8],
pub size: IntSize,
pub real_width: usize,
}
impl<'a> SubPixmapMut<'a> {
/// Returns a mutable slice of pixels.
pub fn pixels_mut(&mut self) -> &mut [PremultipliedColorU8] {
bytemuck::cast_slice_mut(self.data)
}
}
/// Returns minimum bytes per row as usize.
///
/// Pixmap's maximum value for row bytes must fit in 31 bits.
fn min_row_bytes(size: IntSize) -> Option<NonZeroUsize> {
let w = i32::try_from(size.width()).ok()?;
let w = w.checked_mul(BYTES_PER_PIXEL as i32)?;
NonZeroUsize::new(w as usize)
}
/// Returns storage size required by pixel array.
fn compute_data_len(size: IntSize, row_bytes: usize) -> Option<usize> {
let h = size.height().checked_sub(1)?;
let h = (h as usize).checked_mul(row_bytes)?;
let w = (size.width() as usize).checked_mul(BYTES_PER_PIXEL)?;
h.checked_add(w)
}
fn data_len_for_size(size: IntSize) -> Option<usize> {
let row_bytes = min_row_bytes(size)?;
compute_data_len(size, row_bytes.get())
}
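Since the last row needs no trailing padding, `compute_data_len` is `(height - 1) * row_bytes + width * BYTES_PER_PIXEL`, which with width == stride reduces to `width * height * 4`. A standalone sketch of the same math (omitting the i32::MAX/4 width check the real code performs via `min_row_bytes`):

```rust
const BYTES_PER_PIXEL: usize = 4;

// Storage size for a tightly packed RGBA pixmap: full rows for all but the
// last row, plus one row's worth of pixels. Checked math rejects zero sizes
// and overflow.
fn data_len(width: u32, height: u32) -> Option<usize> {
    let row_bytes = (width as usize).checked_mul(BYTES_PER_PIXEL)?;
    if row_bytes == 0 {
        return None;
    }
    let h = (height as usize).checked_sub(1)?;
    h.checked_mul(row_bytes)?.checked_add(row_bytes)
}

fn main() {
    assert_eq!(data_len(100, 50), Some(100 * 50 * 4));
    assert_eq!(data_len(100, 0), None); // zero size is an error
    assert_eq!(data_len(0, 50), None);
}
```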

// Copyright 2006 The Android Open Source Project
// Copyright 2020 Yevhenii Reizner
//
// Use of this source code is governed by a BSD-style license that can be
// found in the LICENSE file.
use core::convert::TryInto;
use tiny_skia_path::{f32x2, PathVerb, SaturateCast, Scalar, ScreenIntRect};
use crate::{IntRect, LineCap, Path, PathSegment, Point, Rect};
use crate::blitter::Blitter;
use crate::fixed_point::{fdot16, fdot6};
use crate::line_clipper;
use crate::math::LENGTH_U32_ONE;
use crate::path_geometry;
#[cfg(all(not(feature = "std"), feature = "no-std-float"))]
use tiny_skia_path::NoStdFloat;
const FLOAT_PI: f32 = 3.14159265;
pub type LineProc = fn(&[Point], Option<&ScreenIntRect>, &mut dyn Blitter) -> Option<()>;
const MAX_CUBIC_SUBDIVIDE_LEVEL: u8 = 9;
const MAX_QUAD_SUBDIVIDE_LEVEL: u8 = 5;
pub fn stroke_path(
path: &Path,
line_cap: LineCap,
clip: &ScreenIntRect,
blitter: &mut dyn Blitter,
) -> Option<()> {
super::hairline::stroke_path_impl(path, line_cap, clip, hair_line_rgn, blitter)
}
fn hair_line_rgn(
points: &[Point],
clip: Option<&ScreenIntRect>,
blitter: &mut dyn Blitter,
) -> Option<()> {
let max = 32767.0;
let fixed_bounds = Rect::from_ltrb(-max, -max, max, max)?;
let clip_bounds = clip.map(|c| c.to_rect());
for i in 0..points.len() - 1 {
let mut pts = [Point::zero(); 2];
// We have to pre-clip the line to fit in a Fixed, so we just chop the line.
if !line_clipper::intersect(&[points[i], points[i + 1]], &fixed_bounds, &mut pts) {
continue;
}
if let Some(clip_bounds) = clip_bounds {
let tmp = pts.clone();
// Perform a clip in scalar space, so we catch huge values which might
// be missed after we convert to FDot6 (overflow).
if !line_clipper::intersect(&tmp, &clip_bounds, &mut pts) {
continue;
}
}
let mut x0 = fdot6::from_f32(pts[0].x);
let mut y0 = fdot6::from_f32(pts[0].y);
let mut x1 = fdot6::from_f32(pts[1].x);
let mut y1 = fdot6::from_f32(pts[1].y);
debug_assert!(fdot6::can_convert_to_fdot16(x0));
debug_assert!(fdot6::can_convert_to_fdot16(y0));
debug_assert!(fdot6::can_convert_to_fdot16(x1));
debug_assert!(fdot6::can_convert_to_fdot16(y1));
let dx = x1 - x0;
let dy = y1 - y0;
if dx.abs() > dy.abs() {
// mostly horizontal
if x0 > x1 {
// we want to go left-to-right
core::mem::swap(&mut x0, &mut x1);
core::mem::swap(&mut y0, &mut y1);
}
let mut ix0 = fdot6::round(x0);
let ix1 = fdot6::round(x1);
if ix0 == ix1 {
// too short to draw
continue;
}
let slope = fdot16::div(dy, dx);
#[allow(clippy::precedence)]
let mut start_y = fdot6::to_fdot16(y0) + (slope * ((32 - x0) & 63) >> 6);
// In some cases, probably due to precision/rounding issues,
// `start_y` can become equal to the image height,
// which will lead to panic, because we would be accessing pixels outside
// the current memory buffer.
            // This is a tiny-skia-specific issue. Skia handles this part differently.
let max_y = if let Some(clip_bounds) = clip_bounds {
fdot16::from_f32(clip_bounds.bottom())
} else {
i32::MAX
};
debug_assert!(ix0 < ix1);
loop {
if ix0 >= 0 && start_y >= 0 && start_y < max_y {
blitter.blit_h(ix0 as u32, (start_y >> 16) as u32, LENGTH_U32_ONE);
}
start_y += slope;
ix0 += 1;
if ix0 >= ix1 {
break;
}
}
} else {
// mostly vertical
if y0 > y1 {
// we want to go top-to-bottom
core::mem::swap(&mut x0, &mut x1);
core::mem::swap(&mut y0, &mut y1);
}
let mut iy0 = fdot6::round(y0);
let iy1 = fdot6::round(y1);
if iy0 == iy1 {
// too short to draw
continue;
}
let slope = fdot16::div(dx, dy);
#[allow(clippy::precedence)]
let mut start_x = fdot6::to_fdot16(x0) + (slope * ((32 - y0) & 63) >> 6);
debug_assert!(iy0 < iy1);
loop {
if start_x >= 0 && iy0 >= 0 {
blitter.blit_h((start_x >> 16) as u32, iy0 as u32, LENGTH_U32_ONE);
}
start_x += slope;
iy0 += 1;
if iy0 >= iy1 {
break;
}
}
}
}
Some(())
}
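The walker above works in fixed point: `fdot6` is assumed to be a 26.6 format (value × 64) and `fdot16` a 16.16 format (value × 65536), following Skia's FDot6/FDot16 conventions, so the 6-to-16 conversion is a shift by 10. A standalone sketch of those assumed layouts (helper names are hypothetical):

```rust
// Assumed fixed-point layouts matching Skia's conventions:
// FDot6 stores value * 64 in an i32; FDot16 stores value * 65536.
fn fdot6_from_f32(v: f32) -> i32 {
    (v * 64.0).round() as i32
}

fn fdot6_to_fdot16(v: i32) -> i32 {
    v << 10 // 64 * 1024 == 65536
}

fn main() {
    assert_eq!(fdot6_from_f32(1.5), 96); // 1.5 * 64
    assert_eq!(fdot6_to_fdot16(96), 98304); // 1.5 * 65536
}
```

This is why the hairline walker can accumulate `start_y += slope` per column: both sides are FDot16, and `start_y >> 16` recovers the integer pixel row.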
pub fn stroke_path_impl(
path: &Path,
line_cap: LineCap,
clip: &ScreenIntRect,
line_proc: LineProc,
blitter: &mut dyn Blitter,
) -> Option<()> {
let mut inset_clip = None;
let mut outset_clip = None;
{
let cap_out = if line_cap == LineCap::Butt { 1.0 } else { 2.0 };
let ibounds = path.bounds().outset(cap_out, cap_out)?.round_out()?;
clip.to_int_rect().intersect(&ibounds)?;
if !clip.to_int_rect().contains(&ibounds) {
// We now cache two scalar rects, to use for culling per-segment (e.g. cubic).
            // Since we're hairlining, the "bounds" of the control points isn't necessarily the
// limit of where a segment can draw (it might draw up to 1 pixel beyond in aa-hairs).
//
            // Computing the pt-bounds per segment is easy, so we do that, and then inversely adjust
// the culling bounds so we can just do a straight compare per segment.
//
            // insetClip is used for quick-accept (i.e. the segment is not clipped), so we inset
// it from the clip-bounds (since segment bounds can be off by 1).
//
// outsetClip is used for quick-reject (i.e. the segment is entirely outside), so we
// outset it from the clip-bounds.
outset_clip = Some(clip.to_int_rect().make_outset(1, 1)?);
inset_clip = Some(clip.to_int_rect().inset(1, 1)?);
}
}
let clip = Some(clip);
let mut prev_verb = PathVerb::Move;
let mut first_pt = Point::zero();
let mut last_pt = Point::zero();
let mut iter = path.segments();
while let Some(segment) = iter.next() {
let verb = iter.curr_verb();
let next_verb = iter.next_verb();
let last_pt2;
match segment {
PathSegment::MoveTo(p) => {
first_pt = p;
last_pt = p;
last_pt2 = p;
}
PathSegment::LineTo(p) => {
let mut points = [last_pt, p];
if line_cap != LineCap::Butt {
extend_pts(line_cap, prev_verb, next_verb, &mut points);
}
line_proc(&points, clip, blitter);
last_pt = p;
last_pt2 = points[0];
}
PathSegment::QuadTo(p0, p1) => {
let mut points = [last_pt, p0, p1];
if line_cap != LineCap::Butt {
extend_pts(line_cap, prev_verb, next_verb, &mut points);
}
hair_quad(
&points,
clip,
inset_clip.as_ref(),
outset_clip.as_ref(),
compute_quad_level(&points),
line_proc,
blitter,
);
last_pt = p1;
last_pt2 = points[0];
}
PathSegment::CubicTo(p0, p1, p2) => {
let mut points = [last_pt, p0, p1, p2];
if line_cap != LineCap::Butt {
extend_pts(line_cap, prev_verb, next_verb, &mut points);
}
hair_cubic(
&points,
clip,
inset_clip.as_ref(),
outset_clip.as_ref(),
line_proc,
blitter,
);
last_pt = p2;
last_pt2 = points[0];
}
PathSegment::Close => {
let mut points = [last_pt, first_pt];
if line_cap != LineCap::Butt && prev_verb == PathVerb::Move {
// cap moveTo/close to match svg expectations for degenerate segments
extend_pts(line_cap, prev_verb, next_verb, &mut points);
}
line_proc(&points, clip, blitter);
last_pt2 = points[0];
}
}
if line_cap != LineCap::Butt {
if prev_verb == PathVerb::Move
&& matches!(verb, PathVerb::Line | PathVerb::Quad | PathVerb::Cubic)
{
first_pt = last_pt2; // the curve moved the initial point, so close to it instead
}
prev_verb = verb;
}
}
Some(())
}
/// Extend the points in the direction of the starting or ending tangent by 1/2 unit to
/// account for a round or square cap.
///
/// If there's no distance between the end point and
/// the control point, use the next control point to create a tangent. If the curve
/// is degenerate, move the cap out 1/2 unit horizontally.
fn extend_pts(
line_cap: LineCap,
prev_verb: PathVerb,
next_verb: Option<PathVerb>,
points: &mut [Point],
) {
debug_assert!(!points.is_empty()); // TODO: use non-zero slice
debug_assert!(line_cap != LineCap::Butt);
// The area of a circle is PI*R*R. For a unit circle, R=1/2, and the cap covers half of that.
let cap_outset = if line_cap == LineCap::Square {
0.5
} else {
FLOAT_PI / 8.0
};
if prev_verb == PathVerb::Move {
let first = points[0];
let mut offset = 0;
let mut controls = points.len() - 1;
let mut tangent;
loop {
offset += 1;
tangent = first - points[offset];
if !tangent.is_zero() {
break;
}
controls -= 1;
if controls == 0 {
break;
}
}
if tangent.is_zero() {
tangent = Point::from_xy(1.0, 0.0);
controls = points.len() - 1; // If all points are equal, move all but one.
} else {
tangent.normalize();
}
offset = 0;
loop {
// If the end point and control points are equal, loop to move them in tandem.
points[offset].x += tangent.x * cap_outset;
points[offset].y += tangent.y * cap_outset;
offset += 1;
controls += 1;
if controls >= points.len() {
break;
}
}
}
if matches!(
next_verb,
Some(PathVerb::Move) | Some(PathVerb::Close) | None
) {
let last = points.last().unwrap().clone();
let mut offset = points.len() - 1;
let mut controls = points.len() - 1;
let mut tangent;
loop {
offset -= 1;
tangent = last - points[offset];
if !tangent.is_zero() {
break;
}
controls -= 1;
if controls == 0 {
break;
}
}
if tangent.is_zero() {
tangent = Point::from_xy(-1.0, 0.0);
controls = points.len() - 1;
} else {
tangent.normalize();
}
offset = points.len() - 1;
loop {
points[offset].x += tangent.x * cap_outset;
points[offset].y += tangent.y * cap_outset;
offset -= 1;
controls += 1;
if controls >= points.len() {
break;
}
}
}
}
fn hair_quad(
points: &[Point; 3],
mut clip: Option<&ScreenIntRect>,
inset_clip: Option<&IntRect>,
outset_clip: Option<&IntRect>,
level: u8,
line_proc: LineProc,
blitter: &mut dyn Blitter,
) -> Option<()> {
if let Some(inset_clip) = inset_clip {
debug_assert!(outset_clip.is_some());
let inset_clip = inset_clip.to_rect();
let outset_clip = outset_clip?.to_rect();
let bounds = compute_nocheck_quad_bounds(points)?;
if !geometric_overlap(&outset_clip, &bounds) {
return Some(());
} else if geometric_contains(&inset_clip, &bounds) {
clip = None;
}
}
hair_quad2(points, clip, level, line_proc, blitter);
Some(())
}
fn compute_nocheck_quad_bounds(points: &[Point; 3]) -> Option<Rect> {
debug_assert!(points[0].is_finite());
debug_assert!(points[1].is_finite());
debug_assert!(points[2].is_finite());
let mut min = points[0].to_f32x2();
let mut max = min;
for i in 1..3 {
let pair = points[i].to_f32x2();
min = min.min(pair);
max = max.max(pair);
}
Rect::from_ltrb(min.x(), min.y(), max.x(), max.y())
}
fn geometric_overlap(a: &Rect, b: &Rect) -> bool {
a.left() < b.right() && b.left() < a.right() && a.top() < b.bottom() && b.top() < a.bottom()
}
fn geometric_contains(outer: &Rect, inner: &Rect) -> bool {
inner.right() <= outer.right()
&& inner.left() >= outer.left()
&& inner.bottom() <= outer.bottom()
&& inner.top() >= outer.top()
}
fn hair_quad2(
points: &[Point; 3],
clip: Option<&ScreenIntRect>,
level: u8,
line_proc: LineProc,
blitter: &mut dyn Blitter,
) {
debug_assert!(level <= MAX_QUAD_SUBDIVIDE_LEVEL); // TODO: to type
let coeff = path_geometry::QuadCoeff::from_points(points);
const MAX_POINTS: usize = (1 << MAX_QUAD_SUBDIVIDE_LEVEL) + 1;
let lines = 1 << level;
debug_assert!(lines < MAX_POINTS);
let mut tmp = [Point::zero(); MAX_POINTS];
tmp[0] = points[0];
let mut t = f32x2::default();
let dt = f32x2::splat(1.0 / lines as f32);
for i in 1..lines {
t = t + dt;
let v = (coeff.a * t + coeff.b) * t + coeff.c;
tmp[i] = Point::from_xy(v.x(), v.y());
}
tmp[lines] = points[2];
line_proc(&tmp[0..lines + 1], clip, blitter);
}
fn compute_quad_level(points: &[Point; 3]) -> u8 {
let d = compute_int_quad_dist(points);
// Quadratics approach the line connecting their start and end points
// 4x closer with each subdivision, so we compute the number of
    // subdivisions to be the minimum needed to get that distance to be less
// than a pixel.
let mut level = (33 - d.leading_zeros()) >> 1;
// sanity check on level (from the previous version)
if level > MAX_QUAD_SUBDIVIDE_LEVEL as u32 {
level = MAX_QUAD_SUBDIVIDE_LEVEL as u32;
}
level as u8
}
fn compute_int_quad_dist(points: &[Point; 3]) -> u32 {
// compute the vector between the control point ([1]) and the middle of the
// line connecting the start and end ([0] and [2])
let dx = ((points[0].x + points[2].x).half() - points[1].x).abs();
let dy = ((points[0].y + points[2].y).half() - points[1].y).abs();
// convert to whole pixel values (use ceiling to be conservative).
// assign to unsigned so we can safely add 1/2 of the smaller and still fit in
// u32, since T::saturate_from() returns 31 bits at most.
let idx = i32::saturate_from(dx.ceil()) as u32;
let idy = i32::saturate_from(dy.ceil()) as u32;
// use the cheap approx for distance
if idx > idy {
idx + (idy >> 1)
} else {
idy + (idx >> 1)
}
}
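The "cheap approx for distance" at the end is the classic max-plus-half-min octagonal approximation of Euclidean distance; a small sketch (the helper name is hypothetical):

```rust
// Octagonal distance approximation: max(|dx|, |dy|) + min(|dx|, |dy|) / 2
// stays within roughly 12% of sqrt(dx*dx + dy*dy) without a sqrt call.
fn approx_dist(idx: u32, idy: u32) -> u32 {
    if idx > idy {
        idx + (idy >> 1)
    } else {
        idy + (idx >> 1)
    }
}

fn main() {
    assert_eq!(approx_dist(3, 4), 5); // exact for a 3-4-5 triangle
    assert_eq!(approx_dist(8, 8), 12); // true distance ~11.3; the diagonal is the worst case
    assert_eq!(approx_dist(5, 0), 5); // exact on the axes
}
```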
fn hair_cubic(
points: &[Point; 4],
mut clip: Option<&ScreenIntRect>,
inset_clip: Option<&IntRect>,
outset_clip: Option<&IntRect>,
line_proc: LineProc,
blitter: &mut dyn Blitter,
) -> Option<()> {
if let Some(inset_clip) = inset_clip {
debug_assert!(outset_clip.is_some());
let inset_clip = inset_clip.to_rect();
let outset_clip = outset_clip?.to_rect();
let bounds = compute_nocheck_cubic_bounds(points)?;
if !geometric_overlap(&outset_clip, &bounds) {
return Some(());
} else if geometric_contains(&inset_clip, &bounds) {
clip = None;
}
}
if quick_cubic_niceness_check(points) {
hair_cubic2(points, clip, line_proc, blitter);
} else {
let mut tmp = [Point::zero(); 13];
let mut t_values = path_geometry::new_t_values();
let count = path_geometry::chop_cubic_at_max_curvature(points, &mut t_values, &mut tmp);
for i in 0..count {
let offset = i * 3;
let new_points: [Point; 4] = tmp[offset..offset + 4].try_into().unwrap();
hair_cubic2(&new_points, clip, line_proc, blitter);
}
}
Some(())
}
fn compute_nocheck_cubic_bounds(points: &[Point; 4]) -> Option<Rect> {
debug_assert!(points[0].is_finite());
debug_assert!(points[1].is_finite());
debug_assert!(points[2].is_finite());
debug_assert!(points[3].is_finite());
let mut min = points[0].to_f32x2();
let mut max = min;
for i in 1..4 {
let pair = points[i].to_f32x2();
min = min.min(pair);
max = max.max(pair);
}
Rect::from_ltrb(min.x(), min.y(), max.x(), max.y())
}
// The off-curve points are "inside" the limits of the on-curve points.
fn quick_cubic_niceness_check(points: &[Point; 4]) -> bool {
lt_90(points[1], points[0], points[3])
&& lt_90(points[2], points[0], points[3])
&& lt_90(points[1], points[3], points[0])
&& lt_90(points[2], points[3], points[0])
}
fn lt_90(p0: Point, pivot: Point, p2: Point) -> bool {
(p0 - pivot).dot(p2 - pivot) >= 0.0
}
fn hair_cubic2(
points: &[Point; 4],
clip: Option<&ScreenIntRect>,
line_proc: LineProc,
blitter: &mut dyn Blitter,
) {
let lines = compute_cubic_segments(points);
debug_assert!(lines > 0);
if lines == 1 {
line_proc(&[points[0], points[3]], clip, blitter);
return;
}
let coeff = path_geometry::CubicCoeff::from_points(points);
const MAX_POINTS: usize = (1 << MAX_CUBIC_SUBDIVIDE_LEVEL) + 1;
debug_assert!(lines < MAX_POINTS);
let mut tmp = [Point::zero(); MAX_POINTS];
let dt = f32x2::splat(1.0 / lines as f32);
let mut t = f32x2::default();
tmp[0] = points[0];
for i in 1..lines {
t = t + dt;
tmp[i] = Point::from_f32x2(((coeff.a * t + coeff.b) * t + coeff.c) * t + coeff.d);
}
if tmp.iter().all(|p| p.is_finite()) {
tmp[lines] = points[3];
line_proc(&tmp[0..lines + 1], clip, blitter);
} else {
// some point(s) are non-finite, so don't draw
}
}
fn compute_cubic_segments(points: &[Point; 4]) -> usize {
let p0 = points[0].to_f32x2();
let p1 = points[1].to_f32x2();
let p2 = points[2].to_f32x2();
let p3 = points[3].to_f32x2();
let one_third = f32x2::splat(1.0 / 3.0);
let two_third = f32x2::splat(2.0 / 3.0);
let p13 = one_third * p3 + two_third * p0;
let p23 = one_third * p0 + two_third * p3;
let diff = (p1 - p13).abs().max((p2 - p23).abs()).max_component();
let mut tol = 1.0 / 8.0;
for i in 0..MAX_CUBIC_SUBDIVIDE_LEVEL {
if diff < tol {
return 1 << i;
}
tol *= 4.0;
}
1 << MAX_CUBIC_SUBDIVIDE_LEVEL
}

// Copyright 2011 The Android Open Source Project
// Copyright 2020 Yevhenii Reizner
//
// Use of this source code is governed by a BSD-style license that can be
// found in the LICENSE file.
use core::convert::TryFrom;
use core::num::NonZeroU16;
use tiny_skia_path::ScreenIntRect;
use crate::{IntRect, LengthU32, LineCap, Path, Point, Rect};
use crate::alpha_runs::{AlphaRun, AlphaRuns};
use crate::blitter::Blitter;
use crate::color::AlphaU8;
use crate::fixed_point::{fdot16, fdot6, fdot8, FDot16, FDot6, FDot8};
use crate::line_clipper;
use crate::math::LENGTH_U32_ONE;
#[derive(Copy, Clone, Debug)]
struct FixedRect {
left: FDot16,
top: FDot16,
right: FDot16,
bottom: FDot16,
}
impl FixedRect {
fn from_rect(src: &Rect) -> Self {
FixedRect {
left: fdot16::from_f32(src.left()),
top: fdot16::from_f32(src.top()),
right: fdot16::from_f32(src.right()),
bottom: fdot16::from_f32(src.bottom()),
}
}
}
/// Multiplies value by 0..256, and shifts the result down 8
/// (i.e. returns (value * alpha256) >> 8)
fn alpha_mul(value: AlphaU8, alpha256: i32) -> u8 {
let a = (i32::from(value) * alpha256) >> 8;
debug_assert!(a >= 0 && a <= 255);
a as u8
}
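A quick demonstration of what `alpha_mul` computes on the 0..=256 alpha scale (256 meaning fully opaque, so multiplying by 256 is the identity); the standalone copy below exists only so the values can be checked:

```rust
// alpha256 is on a 0..=256 scale; >> 8 divides by 256, so a full alpha
// of 256 leaves the value unchanged.
fn alpha_mul(value: u8, alpha256: i32) -> u8 {
    let a = (i32::from(value) * alpha256) >> 8;
    debug_assert!(a >= 0 && a <= 255);
    a as u8
}

fn main() {
    assert_eq!(alpha_mul(255, 256), 255); // full alpha is the identity
    assert_eq!(alpha_mul(255, 128), 127); // half alpha, rounded down
    assert_eq!(alpha_mul(200, 0), 0); // zero alpha kills all coverage
}
```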
pub fn fill_rect(rect: &Rect, clip: &ScreenIntRect, blitter: &mut dyn Blitter) -> Option<()> {
let rect = rect.intersect(&clip.to_rect())?;
let fr = FixedRect::from_rect(&rect);
fill_fixed_rect(&fr, blitter);
Some(())
}
fn fill_fixed_rect(rect: &FixedRect, blitter: &mut dyn Blitter) {
fill_dot8(
fdot8::from_fdot16(rect.left),
fdot8::from_fdot16(rect.top),
fdot8::from_fdot16(rect.right),
fdot8::from_fdot16(rect.bottom),
true,
blitter,
)
}
fn fill_dot8(l: FDot8, t: FDot8, r: FDot8, b: FDot8, fill_inner: bool, blitter: &mut dyn Blitter) {
fn to_alpha(a: i32) -> u8 {
debug_assert!(a >= 0 && a <= 255);
a as u8
}
// check for empty now that we're in our reduced precision space
if l >= r || t >= b {
return;
}
let mut top = t >> 8;
if top == ((b - 1) >> 8) {
// just one scanline high
do_scanline(l, top, r, to_alpha(b - t - 1), blitter);
return;
}
if t & 0xFF != 0 {
do_scanline(l, top, r, to_alpha(256 - (t & 0xFF)), blitter);
top += 1;
}
let bottom = b >> 8;
let height = bottom - top;
if let Some(height) = u32::try_from(height).ok().and_then(LengthU32::new) {
let mut left = l >> 8;
if left == ((r - 1) >> 8) {
// just 1-pixel wide
if let (Ok(left), Ok(top)) = (u32::try_from(left), u32::try_from(top)) {
blitter.blit_v(left, top, height, to_alpha(r - l - 1));
} else {
debug_assert!(false);
}
} else {
if l & 0xFF != 0 {
if let (Ok(left), Ok(top)) = (u32::try_from(left), u32::try_from(top)) {
blitter.blit_v(left, top, height, to_alpha(256 - (l & 0xFF)));
} else {
debug_assert!(false);
}
left += 1;
}
let right = r >> 8;
let width = right - left;
if fill_inner {
if let Some(width) = u32::try_from(width).ok().and_then(LengthU32::new) {
if let (Ok(left), Ok(top)) = (u32::try_from(left), u32::try_from(top)) {
let rect = ScreenIntRect::from_xywh_safe(left, top, width, height);
blitter.blit_rect(&rect);
} else {
debug_assert!(false);
}
} else {
debug_assert!(false);
}
}
if r & 0xFF != 0 {
if let (Ok(right), Ok(top)) = (u32::try_from(right), u32::try_from(top)) {
blitter.blit_v(right, top, height, to_alpha(r & 0xFF));
} else {
debug_assert!(false);
}
}
}
}
if b & 0xFF != 0 {
do_scanline(l, bottom, r, to_alpha(b & 0xFF), blitter);
}
}
fn do_scanline(l: FDot8, top: i32, r: FDot8, alpha: AlphaU8, blitter: &mut dyn Blitter) {
debug_assert!(l < r);
let one_len = LENGTH_U32_ONE;
let top = match u32::try_from(top) {
Ok(n) => n,
_ => return,
};
if (l >> 8) == ((r - 1) >> 8) {
// 1x1 pixel
if let Ok(left) = u32::try_from(l >> 8) {
blitter.blit_v(left, top, one_len, alpha_mul(alpha, r - l));
}
return;
}
let mut left = l >> 8;
if l & 0xFF != 0 {
if let Ok(left) = u32::try_from(l >> 8) {
blitter.blit_v(left, top, one_len, alpha_mul(alpha, 256 - (l & 0xFF)));
}
left += 1;
}
let right = r >> 8;
let width = right - left;
if let Some(width) = u32::try_from(width).ok().and_then(LengthU32::new) {
if let Ok(left) = u32::try_from(left) {
call_hline_blitter(left, Some(top), width, alpha, blitter);
}
}
if r & 0xFF != 0 {
if let Ok(right) = u32::try_from(right) {
blitter.blit_v(right, top, one_len, alpha_mul(alpha, r & 0xFF));
}
}
}
fn call_hline_blitter(
mut x: u32,
y: Option<u32>,
count: LengthU32,
alpha: AlphaU8,
blitter: &mut dyn Blitter,
) {
const HLINE_STACK_BUFFER: usize = 100;
let mut runs = [None; HLINE_STACK_BUFFER + 1];
let mut aa = [0u8; HLINE_STACK_BUFFER];
let mut count = count.get();
loop {
// In theory, we should be able to just do this once (outside of the loop),
// since aa[] and runs[] are supposed to be const when we call the blitter.
// In reality, some wrapper-blitters (e.g. RgnClipBlitter) cast away that
// constness, and modify the buffers in-place. Hence the need to be defensive
// here and reseed the aa value.
aa[0] = alpha;
let mut n = count;
if n > HLINE_STACK_BUFFER as u32 {
n = HLINE_STACK_BUFFER as u32;
}
debug_assert!(n <= core::u16::MAX as u32);
runs[0] = NonZeroU16::new(n as u16);
runs[n as usize] = None;
if let Some(y) = y {
blitter.blit_anti_h(x, y, &mut aa, &mut runs);
}
x += n;
if n >= count || count == 0 {
break;
}
count -= n;
}
}
pub fn stroke_path(
path: &Path,
line_cap: LineCap,
clip: &ScreenIntRect,
blitter: &mut dyn Blitter,
) -> Option<()> {
super::hairline::stroke_path_impl(path, line_cap, clip, anti_hair_line_rgn, blitter)
}
fn anti_hair_line_rgn(
points: &[Point],
clip: Option<&ScreenIntRect>,
blitter: &mut dyn Blitter,
) -> Option<()> {
let max = 32767.0;
let fixed_bounds = Rect::from_ltrb(-max, -max, max, max)?;
let clip_bounds = if let Some(clip) = clip {
// We perform integral clipping later on, but we do a scalar clip first
// to ensure that our coordinates are expressible in fixed/integers.
//
// antialiased hairlines can draw up to 1/2 of a pixel outside of
// their bounds, so we need to outset the clip before calling the
// clipper. To make the numerics safer, we outset by a whole pixel:
// since the 1/2 pixel boundary is important to the antihair blitter,
// we don't want to risk numerical fate by chopping on that edge.
clip.to_rect().outset(1.0, 1.0)
} else {
None
};
for i in 0..points.len() - 1 {
let mut pts = [Point::zero(); 2];
// We have to pre-clip the line to fit in a Fixed, so we just chop the line.
if !line_clipper::intersect(&[points[i], points[i + 1]], &fixed_bounds, &mut pts) {
continue;
}
if let Some(clip_bounds) = clip_bounds {
let tmp = pts;
if !line_clipper::intersect(&tmp, &clip_bounds, &mut pts) {
continue;
}
}
let x0 = fdot6::from_f32(pts[0].x);
let y0 = fdot6::from_f32(pts[0].y);
let x1 = fdot6::from_f32(pts[1].x);
let y1 = fdot6::from_f32(pts[1].y);
if let Some(clip) = clip {
let left = x0.min(x1);
let top = y0.min(y1);
let right = x0.max(x1);
let bottom = y0.max(y1);
let ir = IntRect::from_ltrb(
fdot6::floor(left) - 1,
fdot6::floor(top) - 1,
fdot6::ceil(right) + 1,
fdot6::ceil(bottom) + 1,
)?;
if clip.to_int_rect().intersect(&ir).is_none() {
continue;
}
if !clip.to_int_rect().contains(&ir) {
let subclip = ir
.intersect(&clip.to_int_rect())
.and_then(|r| r.to_screen_int_rect());
if let Some(subclip) = subclip {
do_anti_hairline(x0, y0, x1, y1, Some(subclip), blitter);
}
continue;
}
// fall through to no-clip case
}
do_anti_hairline(x0, y0, x1, y1, None, blitter);
}
Some(())
}
#[derive(Copy, Clone, Debug)]
enum BlitterKind {
HLine,
Horish,
VLine,
Vertish,
}
fn do_anti_hairline(
mut x0: FDot6,
mut y0: FDot6,
mut x1: FDot6,
mut y1: FDot6,
mut clip_opt: Option<ScreenIntRect>,
blitter: &mut dyn Blitter,
) -> Option<()> {
// check for integer NaN (0x80000000) which we can't handle (can't negate it)
// It appears typically from a huge float (inf or nan) being converted to int.
// If we see it, just don't draw.
if any_bad_ints(x0, y0, x1, y1) != 0 {
return None;
}
// The caller must clip the line to [-32767.0 ... 32767.0] ahead of time (in dot6 format)
debug_assert!(fdot6::can_convert_to_fdot16(x0));
debug_assert!(fdot6::can_convert_to_fdot16(y0));
debug_assert!(fdot6::can_convert_to_fdot16(x1));
debug_assert!(fdot6::can_convert_to_fdot16(y1));
if (x1 - x0).abs() > fdot6::from_i32(511) || (y1 - y0).abs() > fdot6::from_i32(511) {
// instead of (x0 + x1) >> 1, we shift each separately. This is less
// precise, but avoids overflowing the intermediate result if the
// values are huge. A better fix might be to clip the original pts
// directly (i.e. do the divide), so we don't spend time subdividing
// huge lines at all.
let hx = (x0 >> 1) + (x1 >> 1);
let hy = (y0 >> 1) + (y1 >> 1);
do_anti_hairline(x0, y0, hx, hy, clip_opt, blitter);
do_anti_hairline(hx, hy, x1, y1, clip_opt, blitter);
return Some(());
}
let mut scale_start;
let mut scale_stop;
let mut istart;
let mut istop;
let mut fstart;
let slope;
let blitter_kind;
if (x1 - x0).abs() > (y1 - y0).abs() {
// mostly horizontal
if x0 > x1 {
// we want to go left-to-right
core::mem::swap(&mut x0, &mut x1);
core::mem::swap(&mut y0, &mut y1);
}
istart = fdot6::floor(x0);
istop = fdot6::ceil(x1);
fstart = fdot6::to_fdot16(y0);
if y0 == y1 {
// completely horizontal, take fast case
slope = 0;
blitter_kind = Some(BlitterKind::HLine);
} else {
slope = fdot16::fast_div(y1 - y0, x1 - x0);
debug_assert!(slope >= -fdot16::ONE && slope <= fdot16::ONE);
fstart += (slope * (32 - (x0 & 63)) + 32) >> 6;
blitter_kind = Some(BlitterKind::Horish);
}
debug_assert!(istop > istart);
if istop - istart == 1 {
// we are within a single pixel
scale_start = x1 - x0;
debug_assert!(scale_start >= 0 && scale_start <= 64);
scale_stop = 0;
} else {
scale_start = 64 - (x0 & 63);
scale_stop = x1 & 63;
}
if let Some(clip) = clip_opt {
let clip = clip.to_int_rect();
if istart >= clip.right() || istop <= clip.left() {
return Some(());
}
if istart < clip.left() {
fstart += slope * (clip.left() - istart);
istart = clip.left();
scale_start = 64;
if istop - istart == 1 {
// we are within a single pixel
scale_start = contribution_64(x1);
scale_stop = 0;
}
}
if istop > clip.right() {
istop = clip.right();
scale_stop = 0; // so we don't draw this last column
}
debug_assert!(istart <= istop);
if istart == istop {
return Some(());
}
// now test if our Y values are completely inside the clip
let (mut top, mut bottom) = if slope >= 0 {
// T2B
let top = fdot16::floor_to_i32(fstart - fdot16::HALF);
let bottom =
fdot16::ceil_to_i32(fstart + (istop - istart - 1) * slope + fdot16::HALF);
(top, bottom)
} else {
// B2T
let bottom = fdot16::ceil_to_i32(fstart + fdot16::HALF);
let top =
fdot16::floor_to_i32(fstart + (istop - istart - 1) * slope - fdot16::HALF);
(top, bottom)
};
top -= 1;
bottom += 1;
if top >= clip.bottom() || bottom <= clip.top() {
return Some(());
}
if clip.top() <= top && clip.bottom() >= bottom {
clip_opt = None;
}
}
} else {
// mostly vertical
if y0 > y1 {
// we want to go top-to-bottom
core::mem::swap(&mut x0, &mut x1);
core::mem::swap(&mut y0, &mut y1);
}
istart = fdot6::floor(y0);
istop = fdot6::ceil(y1);
fstart = fdot6::to_fdot16(x0);
if x0 == x1 {
if y0 == y1 {
// are we zero length? nothing to do
return Some(());
}
slope = 0;
blitter_kind = Some(BlitterKind::VLine);
} else {
slope = fdot16::fast_div(x1 - x0, y1 - y0);
debug_assert!(slope <= fdot16::ONE && slope >= -fdot16::ONE);
fstart += (slope * (32 - (y0 & 63)) + 32) >> 6;
blitter_kind = Some(BlitterKind::Vertish);
}
debug_assert!(istop > istart);
if istop - istart == 1 {
// we are within a single pixel
scale_start = y1 - y0;
debug_assert!(scale_start >= 0 && scale_start <= 64);
scale_stop = 0;
} else {
scale_start = 64 - (y0 & 63);
scale_stop = y1 & 63;
}
if let Some(clip) = clip_opt {
let clip = clip.to_int_rect();
if istart >= clip.bottom() || istop <= clip.top() {
return Some(());
}
if istart < clip.top() {
fstart += slope * (clip.top() - istart);
istart = clip.top();
scale_start = 64;
if istop - istart == 1 {
// we are within a single pixel
scale_start = contribution_64(y1);
scale_stop = 0;
}
}
if istop > clip.bottom() {
istop = clip.bottom();
scale_stop = 0; // so we don't draw this last row
}
debug_assert!(istart <= istop);
if istart == istop {
return Some(());
}
// now test if our X values are completely inside the clip
let (mut left, mut right) = if slope >= 0 {
// L2R
let left = fdot16::floor_to_i32(fstart - fdot16::HALF);
let right =
fdot16::ceil_to_i32(fstart + (istop - istart - 1) * slope + fdot16::HALF);
(left, right)
} else {
// R2L
let right = fdot16::ceil_to_i32(fstart + fdot16::HALF);
let left =
fdot16::floor_to_i32(fstart + (istop - istart - 1) * slope - fdot16::HALF);
(left, right)
};
left -= 1;
right += 1;
if left >= clip.right() || right <= clip.left() {
return Some(());
}
if clip.left() <= left && clip.right() >= right {
clip_opt = None;
}
}
}
let mut clip_blitter;
let blitter = if let Some(clip) = clip_opt {
clip_blitter = RectClipBlitter { blitter, clip };
&mut clip_blitter
} else {
blitter
};
// A bit ugly, but looks like this is the only way to have a stack-allocated trait object.
let mut hline_blitter;
let mut horish_blitter;
let mut vline_blitter;
let mut vertish_blitter;
let hair_blitter: &mut dyn AntiHairBlitter = match blitter_kind? {
BlitterKind::HLine => {
hline_blitter = HLineAntiHairBlitter(blitter);
&mut hline_blitter
}
BlitterKind::Horish => {
horish_blitter = HorishAntiHairBlitter(blitter);
&mut horish_blitter
}
BlitterKind::VLine => {
vline_blitter = VLineAntiHairBlitter(blitter);
&mut vline_blitter
}
BlitterKind::Vertish => {
vertish_blitter = VertishAntiHairBlitter(blitter);
&mut vertish_blitter
}
};
debug_assert!(istart >= 0);
let mut istart = istart as u32;
debug_assert!(istop >= 0);
let istop = istop as u32;
fstart = hair_blitter.draw_cap(istart, fstart, slope, scale_start);
istart += 1;
let full_spans = istop - istart - (scale_stop > 0) as u32;
if full_spans > 0 {
fstart = hair_blitter.draw_line(istart, istart + full_spans, fstart, slope);
}
if scale_stop > 0 {
hair_blitter.draw_cap(istop - 1, fstart, slope, scale_stop);
}
Some(())
}
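The long-line split near the top of `do_anti_hairline` computes the midpoint as `(x0 >> 1) + (x1 >> 1)` rather than `(x0 + x1) >> 1`; a sketch of why, using a hypothetical helper name:

```rust
// Shifting each operand before adding cannot overflow i32, at the cost of
// dropping each value's low bit (so the "midpoint" can be off by one).
fn safe_mid(a: i32, b: i32) -> i32 {
    (a >> 1) + (b >> 1)
}

fn main() {
    // (i32::MAX + i32::MAX) >> 1 would overflow; the split form is fine.
    assert_eq!(safe_mid(i32::MAX, i32::MAX), i32::MAX - 1);
    assert_eq!(safe_mid(6, 8), 7); // exact when both low bits are 0
    assert_eq!(safe_mid(7, 9), 7); // off by one when both are odd (true mid is 8)
}
```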
// returns high-bit set iff x == 0x80000000 (i.e. i32::MIN, the one value that can't be negated)
fn bad_int(x: i32) -> i32 {
x & -x
}
fn any_bad_ints(a: i32, b: i32, c: i32, d: i32) -> i32 {
(bad_int(a) | bad_int(b) | bad_int(c) | bad_int(d)) >> ((core::mem::size_of::<i32>() << 3) - 1)
}
// We want the fractional part of ordinate, but we want multiples of 64 to
// return 64, not 0, so we can't just say (ordinate & 63).
// We basically want to compute those bits, and if they're 0, return 64.
// We can do that w/o a branch with an extra sub and add.
fn contribution_64(ordinate: FDot6) -> i32 {
let result = ((ordinate - 1) & 63) + 1;
debug_assert!(result > 0 && result <= 64);
result
}
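The comment above can be checked directly: `((x - 1) & 63) + 1` yields the fractional 1/64ths of the ordinate, except that exact multiples of 64 map to 64 instead of 0.

```rust
// Fractional part of a 26.6 fixed-point ordinate, where a multiple of 64
// contributes a full 64 rather than 0 — branch-free, per the comment above.
fn contribution_64(ordinate: i32) -> i32 {
    ((ordinate - 1) & 63) + 1
}

fn main() {
    assert_eq!(contribution_64(1), 1);
    assert_eq!(contribution_64(63), 63);
    assert_eq!(contribution_64(64), 64); // multiple of 64 -> 64, not 0
    assert_eq!(contribution_64(65), 1);
    assert_eq!(contribution_64(128), 64);
}
```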
trait AntiHairBlitter {
fn draw_cap(&mut self, x: u32, fy: FDot16, slope: FDot16, mod64: i32) -> FDot16;
fn draw_line(&mut self, x: u32, stopx: u32, fy: FDot16, slope: FDot16) -> FDot16;
}
struct HLineAntiHairBlitter<'a>(&'a mut dyn Blitter);
impl AntiHairBlitter for HLineAntiHairBlitter<'_> {
fn draw_cap(&mut self, x: u32, mut fy: FDot16, _: FDot16, mod64: i32) -> FDot16 {
fy += fdot16::ONE / 2;
fy = fy.max(0);
let y = (fy >> 16) as u32;
let a = i32_to_alpha(fy >> 8);
// lower line
let mut ma = fdot6::small_scale(a, mod64);
if ma != 0 {
call_hline_blitter(x, Some(y), LENGTH_U32_ONE, ma, self.0);
}
// upper line
ma = fdot6::small_scale(255 - a, mod64);
if ma != 0 {
call_hline_blitter(x, y.checked_sub(1), LENGTH_U32_ONE, ma, self.0);
}
fy - fdot16::ONE / 2
}
fn draw_line(&mut self, x: u32, stop_x: u32, mut fy: FDot16, _: FDot16) -> FDot16 {
let count = match LengthU32::new(stop_x - x) {
Some(n) => n,
None => return fy,
};
fy += fdot16::ONE / 2;
fy = fy.max(0);
let y = (fy >> 16) as u32;
let mut a = i32_to_alpha(fy >> 8);
// lower line
if a != 0 {
call_hline_blitter(x, Some(y), count, a, self.0);
}
// upper line
a = 255 - a;
if a != 0 {
call_hline_blitter(x, y.checked_sub(1), count, a, self.0);
}
fy - fdot16::ONE / 2
}
}
struct HorishAntiHairBlitter<'a>(&'a mut dyn Blitter);
impl AntiHairBlitter for HorishAntiHairBlitter<'_> {
fn draw_cap(&mut self, x: u32, mut fy: FDot16, dy: FDot16, mod64: i32) -> FDot16 {
fy += fdot16::ONE / 2;
fy = fy.max(0);
let lower_y = (fy >> 16) as u32;
let a = i32_to_alpha(fy >> 8);
let a0 = fdot6::small_scale(255 - a, mod64);
let a1 = fdot6::small_scale(a, mod64);
self.0.blit_anti_v2(x, lower_y.max(1) - 1, a0, a1);
fy + dy - fdot16::ONE / 2
}
fn draw_line(&mut self, mut x: u32, stop_x: u32, mut fy: FDot16, dy: FDot16) -> FDot16 {
debug_assert!(x < stop_x);
fy += fdot16::ONE / 2;
loop {
fy = fy.max(0);
let lower_y = (fy >> 16) as u32;
let a = i32_to_alpha(fy >> 8);
self.0.blit_anti_v2(x, lower_y.max(1) - 1, 255 - a, a);
fy += dy;
x += 1;
if x >= stop_x {
break;
}
}
fy - fdot16::ONE / 2
}
}
struct VLineAntiHairBlitter<'a>(&'a mut dyn Blitter);
impl AntiHairBlitter for VLineAntiHairBlitter<'_> {
fn draw_cap(&mut self, y: u32, mut fx: FDot16, dx: FDot16, mod64: i32) -> FDot16 {
debug_assert!(dx == 0);
fx += fdot16::ONE / 2;
fx = fx.max(0);
let x = (fx >> 16) as u32;
let a = i32_to_alpha(fx >> 8);
let mut ma = fdot6::small_scale(a, mod64);
if ma != 0 {
self.0.blit_v(x, y, LENGTH_U32_ONE, ma);
}
ma = fdot6::small_scale(255 - a, mod64);
if ma != 0 {
self.0.blit_v(x.max(1) - 1, y, LENGTH_U32_ONE, ma);
}
fx - fdot16::ONE / 2
}
fn draw_line(&mut self, y: u32, stop_y: u32, mut fx: FDot16, dx: FDot16) -> FDot16 {
debug_assert!(dx == 0);
let height = match LengthU32::new(stop_y - y) {
Some(n) => n,
None => return fx,
};
fx += fdot16::ONE / 2;
fx = fx.max(0);
let x = (fx >> 16) as u32;
let mut a = i32_to_alpha(fx >> 8);
if a != 0 {
self.0.blit_v(x, y, height, a);
}
a = 255 - a;
if a != 0 {
self.0.blit_v(x.max(1) - 1, y, height, a);
}
fx - fdot16::ONE / 2
}
}
struct VertishAntiHairBlitter<'a>(&'a mut dyn Blitter);
impl AntiHairBlitter for VertishAntiHairBlitter<'_> {
fn draw_cap(&mut self, y: u32, mut fx: FDot16, dx: FDot16, mod64: i32) -> FDot16 {
fx += fdot16::ONE / 2;
fx = fx.max(0);
let x = (fx >> 16) as u32;
let a = i32_to_alpha(fx >> 8);
self.0.blit_anti_h2(
x.max(1) - 1,
y,
fdot6::small_scale(255 - a, mod64),
fdot6::small_scale(a, mod64),
);
fx + dx - fdot16::ONE / 2
}
fn draw_line(&mut self, mut y: u32, stop_y: u32, mut fx: FDot16, dx: FDot16) -> FDot16 {
debug_assert!(y < stop_y);
fx += fdot16::ONE / 2;
loop {
fx = fx.max(0);
let x = (fx >> 16) as u32;
let a = i32_to_alpha(fx >> 8);
self.0.blit_anti_h2(x.max(1) - 1, y, 255 - a, a);
fx += dx;
y += 1;
if y >= stop_y {
break;
}
}
fx - fdot16::ONE / 2
}
}
fn i32_to_alpha(a: i32) -> u8 {
(a & 0xFF) as u8
}
struct RectClipBlitter<'a> {
blitter: &'a mut dyn Blitter,
clip: ScreenIntRect,
}
impl Blitter for RectClipBlitter<'_> {
fn blit_anti_h(
&mut self,
x: u32,
y: u32,
mut antialias: &mut [AlphaU8],
mut runs: &mut [AlphaRun],
) {
fn y_in_rect(y: u32, rect: ScreenIntRect) -> bool {
(y - rect.top()) < rect.height()
}
if !y_in_rect(y, self.clip) || x >= self.clip.right() {
return;
}
let mut x0 = x;
let mut x1 = x + compute_anti_width(runs);
if x1 <= self.clip.left() {
return;
}
debug_assert!(x0 < x1);
if x0 < self.clip.left() {
let dx = self.clip.left() - x0;
AlphaRuns::break_at(antialias, runs, dx as i32);
antialias = &mut antialias[dx as usize..];
runs = &mut runs[dx as usize..];
x0 = self.clip.left();
}
debug_assert!(x0 < x1 && runs[(x1 - x0) as usize].is_none());
if x1 > self.clip.right() {
x1 = self.clip.right();
AlphaRuns::break_at(antialias, runs, (x1 - x0) as i32);
runs[(x1 - x0) as usize] = None;
}
debug_assert!(x0 < x1 && runs[(x1 - x0) as usize].is_none());
debug_assert!(compute_anti_width(runs) == x1 - x0);
self.blitter.blit_anti_h(x0, y, antialias, runs);
}
fn blit_v(&mut self, x: u32, y: u32, height: LengthU32, alpha: AlphaU8) {
fn x_in_rect(x: u32, rect: ScreenIntRect) -> bool {
(x - rect.left()) < rect.width()
}
if !x_in_rect(x, self.clip) {
return;
}
let mut y0 = y;
let mut y1 = y + height.get();
if y0 < self.clip.top() {
y0 = self.clip.top();
}
if y1 > self.clip.bottom() {
y1 = self.clip.bottom();
}
if y0 < y1 {
if let Some(h) = LengthU32::new(y1 - y0) {
self.blitter.blit_v(x, y0, h, alpha);
}
}
}
fn blit_anti_h2(&mut self, x: u32, y: u32, alpha0: AlphaU8, alpha1: AlphaU8) {
self.blit_anti_h(
x,
y,
&mut [alpha0, alpha1],
&mut [NonZeroU16::new(1), NonZeroU16::new(1), None],
);
}
fn blit_anti_v2(&mut self, x: u32, y: u32, alpha0: AlphaU8, alpha1: AlphaU8) {
self.blit_anti_h(x, y, &mut [alpha0], &mut [NonZeroU16::new(1), None]);
self.blit_anti_h(x, y + 1, &mut [alpha1], &mut [NonZeroU16::new(1), None]);
}
}
fn compute_anti_width(runs: &[AlphaRun]) -> u32 {
let mut i = 0;
let mut width = 0;
while let Some(count) = runs[i] {
width += u32::from(count.get());
i += usize::from(count.get());
}
width
}
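The `AlphaRun` layout that `compute_anti_width` (and `blit_anti_h`) walks is a sparse run-length encoding: `runs[i]` holds how many pixels share alpha `antialias[i]`, the next run starts `count` slots later, and a `None` entry terminates the list. A hedged sketch with a hypothetical re-implementation of the walk:

```rust
use core::num::NonZeroU16;

// runs[i] = Some(count): `count` pixels starting at slot i; the entries
// skipped over in between are never read. None terminates the list.
fn width(runs: &[Option<NonZeroU16>]) -> u32 {
    let mut i = 0;
    let mut w = 0;
    while let Some(count) = runs[i] {
        w += u32::from(count.get());
        i += usize::from(count.get());
    }
    w
}

fn main() {
    // Two runs: 3 pixels, then 2 pixels, then the terminator.
    let runs = [
        NonZeroU16::new(3),
        None, // skipped
        None, // skipped
        NonZeroU16::new(2),
        None, // skipped
        None, // terminator
    ];
    assert_eq!(width(&runs), 5);
}
```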

// Copyright 2011 The Android Open Source Project
// Copyright 2020 Yevhenii Reizner
//
// Use of this source code is governed by a BSD-style license that can be
// found in the LICENSE file.
pub mod hairline;
pub mod hairline_aa;
pub mod path;
pub mod path_aa;
use tiny_skia_path::ScreenIntRect;
use crate::{IntRect, Rect};
use crate::blitter::Blitter;
pub fn fill_rect(rect: &Rect, clip: &ScreenIntRect, blitter: &mut dyn Blitter) -> Option<()> {
fill_int_rect(&rect.round()?, clip, blitter)
}
fn fill_int_rect(rect: &IntRect, clip: &ScreenIntRect, blitter: &mut dyn Blitter) -> Option<()> {
let rect = rect.intersect(&clip.to_int_rect())?.to_screen_int_rect()?;
blitter.blit_rect(&rect);
Some(())
}
pub fn fill_rect_aa(rect: &Rect, clip: &ScreenIntRect, blitter: &mut dyn Blitter) -> Option<()> {
hairline_aa::fill_rect(rect, clip, blitter)
}

// Copyright 2006 The Android Open Source Project
// Copyright 2020 Yevhenii Reizner
//
// Use of this source code is governed by a BSD-style license that can be
// found in the LICENSE file.
use core::convert::TryFrom;
use tiny_skia_path::{SaturateCast, ScreenIntRect};
use crate::{FillRule, IntRect, LengthU32, Path, Rect};
use crate::blitter::Blitter;
use crate::edge::{Edge, LineEdge};
use crate::edge_builder::{BasicEdgeBuilder, ShiftedIntRect};
use crate::fixed_point::{fdot16, fdot6, FDot16};
#[cfg(all(not(feature = "std"), feature = "no-std-float"))]
use tiny_skia_path::NoStdFloat;
pub fn fill_path(
path: &Path,
fill_rule: FillRule,
clip: &ScreenIntRect,
blitter: &mut dyn Blitter,
) -> Option<()> {
let ir = conservative_round_to_int(&path.bounds())?;
let path_contained_in_clip = if let Some(bounds) = ir.to_screen_int_rect() {
clip.contains(&bounds)
} else {
// If bounds cannot be converted into ScreenIntRect,
// the path is out of clip.
false
};
// TODO: SkScanClipper
fill_path_impl(
path,
fill_rule,
clip,
ir.y(),
ir.bottom(),
0,
path_contained_in_clip,
blitter,
)
}
// Conservative rounding function, which effectively nudges the int-rect to be slightly larger
// than Rect::round() might have produced. This is a safety-net for the scan-converter, which
// inspects the returned int-rect, and may disable clipping (for speed) if it thinks all of the
// edges will fit inside the clip's bounds. The scan-converter introduces slight numeric errors
// due to accumulated += of the slope, so this function is used to return a conservatively large
// int-bounds, and thus we will only disable clipping if we're sure the edges will stay in-bounds.
fn conservative_round_to_int(src: &Rect) -> Option<IntRect> {
// We must use `from_ltrb`, otherwise rounding will be incorrect.
IntRect::from_ltrb(
round_down_to_int(src.left()),
round_down_to_int(src.top()),
round_up_to_int(src.right()),
round_up_to_int(src.bottom()),
)
}
// Bias used for conservative rounding of float rects to int rects, to nudge the irects a little
// larger, so we don't "think" a path's bounds are inside a clip, when (due to numeric drift in
// the scan-converter) we might walk beyond the predicted limits.
//
// This value has been determined by trial and error: pick the smallest value (after the 0.5) that
// fixes any problematic cases (e.g. crbug.com/844457)
// NOTE: cubics appear to be the main reason for needing this slop. If we could (perhaps) have a
// more accurate walker for cubics, we may be able to reduce this fudge factor.
const CONSERVATIVE_ROUND_BIAS: f64 = 0.5 + 1.5 / fdot6::ONE as f64;
// Round the value down. This is used to round the top and left of a rectangle,
// and corresponds to the way the scan converter treats the top and left edges.
// It has a slight bias to make the "rounded" int smaller than a normal round, to create a more
// conservative int-bounds (larger) from a float rect.
fn round_down_to_int(x: f32) -> i32 {
let mut xx = x as f64;
xx -= CONSERVATIVE_ROUND_BIAS;
i32::saturate_from(xx.ceil())
}
// Round the value up. This is used to round the right and bottom of a rectangle.
// It has a slight bias to make the "rounded" int smaller than a normal round, to create a more
// conservative int-bounds (larger) from a float rect.
fn round_up_to_int(x: f32) -> i32 {
let mut xx = x as f64;
xx += CONSERVATIVE_ROUND_BIAS;
i32::saturate_from(xx.floor())
}
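The effect of the bias is easy to see in isolation. A minimal sketch, minus the saturation step: `0.5 + 1.5/64` nudges left/top down and right/bottom up relative to plain rounding, so the int-bounds can only grow, never shrink.

```rust
// Assumed bias from the code above: 0.5 plus 1.5 FDot6 units.
const BIAS: f64 = 0.5 + 1.5 / 64.0;

fn round_down(x: f32) -> i32 {
    (x as f64 - BIAS).ceil() as i32 // left/top edge
}

fn round_up(x: f32) -> i32 {
    (x as f64 + BIAS).floor() as i32 // right/bottom edge
}

fn main() {
    // A left edge at exactly 10.5 rounds down to 10, not up to 11...
    assert_eq!(round_down(10.5), 10);
    // ...while a right edge just under 10.5 still rounds up to 11.
    assert_eq!(round_up(10.48), 11);
    // The conservative bounds always contain the plainly-rounded bounds.
    assert!(round_down(3.7) <= (3.7f32).round() as i32);
    assert!(round_up(3.2) >= (3.2f32).round() as i32);
}
```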
pub fn fill_path_impl(
path: &Path,
fill_rule: FillRule,
clip_rect: &ScreenIntRect,
mut start_y: i32,
mut stop_y: i32,
shift_edges_up: i32,
path_contained_in_clip: bool,
blitter: &mut dyn Blitter,
) -> Option<()> {
let shifted_clip = ShiftedIntRect::new(clip_rect, shift_edges_up)?;
let clip = if path_contained_in_clip {
None
} else {
Some(&shifted_clip)
};
let mut edges = BasicEdgeBuilder::build_edges(path, clip, shift_edges_up)?;
edges.sort_by(|a, b| {
let mut value_a = a.as_line().first_y;
let mut value_b = b.as_line().first_y;
if value_a == value_b {
value_a = a.as_line().x;
value_b = b.as_line().x;
}
value_a.cmp(&value_b)
});
for i in 0..edges.len() {
// Once the head node is inserted at index 0 below, edge i shifts to
// index i + 1, so its neighbors become i and i + 2.
edges[i].prev = Some(i as u32);
edges[i].next = Some(i as u32 + 2);
}
const EDGE_HEAD_Y: i32 = i32::MIN;
const EDGE_TAIL_Y: i32 = i32::MAX;
edges.insert(
0,
Edge::Line(LineEdge {
prev: None,
next: Some(1),
x: i32::MIN,
first_y: EDGE_HEAD_Y,
..LineEdge::default()
}),
);
edges.push(Edge::Line(LineEdge {
prev: Some(edges.len() as u32 - 1),
next: None,
first_y: EDGE_TAIL_Y,
..LineEdge::default()
}));
start_y <<= shift_edges_up;
stop_y <<= shift_edges_up;
let top = shifted_clip.shifted().y() as i32;
if !path_contained_in_clip && start_y < top {
start_y = top;
}
let bottom = shifted_clip.shifted().bottom() as i32;
if !path_contained_in_clip && stop_y > bottom {
stop_y = bottom;
}
let start_y = u32::try_from(start_y).ok()?;
let stop_y = u32::try_from(stop_y).ok()?;
// TODO: walk_simple_edges
walk_edges(
fill_rule,
start_y,
stop_y,
shifted_clip.shifted().right(),
&mut edges,
blitter,
);
Some(())
}
// TODO: simplify!
fn walk_edges(
fill_rule: FillRule,
start_y: u32,
stop_y: u32,
right_clip: u32,
edges: &mut [Edge],
blitter: &mut dyn Blitter,
) {
let mut curr_y = start_y;
let winding_mask = if fill_rule == FillRule::EvenOdd {
1
} else {
-1
};
loop {
let mut w = 0i32;
let mut left = 0u32;
let mut prev_x = edges[0].x;
let mut curr_idx = edges[0].next.unwrap() as usize;
while edges[curr_idx].first_y <= curr_y as i32 {
debug_assert!(edges[curr_idx].last_y >= curr_y as i32);
let x = fdot16::round_to_i32(edges[curr_idx].x) as u32; // TODO: check
if (w & winding_mask) == 0 {
// we're starting interval
left = x;
}
w += i32::from(edges[curr_idx].winding);
if (w & winding_mask) == 0 {
// we finished an interval
if let Some(width) = LengthU32::new(x - left) {
blitter.blit_h(left, curr_y, width);
}
}
let next_idx = edges[curr_idx].next.unwrap();
let new_x;
if edges[curr_idx].last_y == curr_y as i32 {
// are we done with this edge?
match &mut edges[curr_idx] {
Edge::Line(_) => {
remove_edge(curr_idx, edges);
}
Edge::Quadratic(ref mut quad) => {
if quad.curve_count > 0 && quad.update() {
new_x = quad.line.x;
if new_x < prev_x {
// ripple current edge backwards until it is x-sorted
backward_insert_edge_based_on_x(curr_idx, edges);
} else {
prev_x = new_x;
}
} else {
remove_edge(curr_idx, edges);
}
}
Edge::Cubic(ref mut cubic) => {
if cubic.curve_count < 0 && cubic.update() {
debug_assert!(cubic.line.first_y == curr_y as i32 + 1);
new_x = cubic.line.x;
if new_x < prev_x {
// ripple current edge backwards until it is x-sorted
backward_insert_edge_based_on_x(curr_idx, edges);
} else {
prev_x = new_x;
}
} else {
remove_edge(curr_idx, edges);
}
}
}
} else {
debug_assert!(edges[curr_idx].last_y > curr_y as i32);
new_x = edges[curr_idx].x + edges[curr_idx].dx;
edges[curr_idx].x = new_x;
if new_x < prev_x {
// ripple current edge backwards until it is x-sorted
backward_insert_edge_based_on_x(curr_idx, edges);
} else {
prev_x = new_x;
}
}
curr_idx = next_idx as usize;
}
if (w & winding_mask) != 0 {
// was our right-edge culled away?
if let Some(width) = LengthU32::new(right_clip - left) {
blitter.blit_h(left, curr_y, width);
}
}
curr_y += 1;
if curr_y >= stop_y {
break;
}
// now curr_idx points to the first edge with a first_y larger than curr_y
insert_new_edges(curr_idx, curr_y as i32, edges);
}
}
fn remove_edge(curr_idx: usize, edges: &mut [Edge]) {
let prev = edges[curr_idx].prev.unwrap();
let next = edges[curr_idx].next.unwrap();
edges[prev as usize].next = Some(next);
edges[next as usize].prev = Some(prev);
}
fn backward_insert_edge_based_on_x(curr_idx: usize, edges: &mut [Edge]) {
let x = edges[curr_idx].x;
let mut prev_idx = edges[curr_idx].prev.unwrap() as usize;
while prev_idx != 0 {
if edges[prev_idx].x > x {
prev_idx = edges[prev_idx].prev.unwrap() as usize;
} else {
break;
}
}
let next_idx = edges[prev_idx].next.unwrap() as usize;
if next_idx != curr_idx {
remove_edge(curr_idx, edges);
insert_edge_after(curr_idx, prev_idx, edges);
}
}
fn insert_edge_after(curr_idx: usize, after_idx: usize, edges: &mut [Edge]) {
edges[curr_idx].prev = Some(after_idx as u32);
edges[curr_idx].next = edges[after_idx].next;
let after_next_idx = edges[after_idx].next.unwrap() as usize;
edges[after_next_idx].prev = Some(curr_idx as u32);
edges[after_idx].next = Some(curr_idx as u32);
}
// Start from the right side, searching backwards for the point to begin the new edge list
// insertion, marching forwards from here. The implementation could have started from the left
// of the prior insertion, and search to the right, or with some additional caching, binary
// search the starting point. More work could be done to determine optimal new edge insertion.
fn backward_insert_start(mut prev_idx: usize, x: FDot16, edges: &mut [Edge]) -> usize {
while let Some(prev) = edges[prev_idx].prev {
prev_idx = prev as usize;
if edges[prev_idx].x <= x {
break;
}
}
prev_idx
}
fn insert_new_edges(mut new_idx: usize, curr_y: i32, edges: &mut [Edge]) {
if edges[new_idx].first_y != curr_y {
return;
}
let prev_idx = edges[new_idx].prev.unwrap() as usize;
if edges[prev_idx].x <= edges[new_idx].x {
return;
}
// find first x pos to insert
let mut start_idx = backward_insert_start(prev_idx, edges[new_idx].x, edges);
// insert the lot, fixing up the links as we go
loop {
let next_idx = edges[new_idx].next.unwrap() as usize;
let mut keep_edge = false;
loop {
let after_idx = edges[start_idx].next.unwrap() as usize;
if after_idx == new_idx {
keep_edge = true;
break;
}
if edges[after_idx].x >= edges[new_idx].x {
break;
}
start_idx = after_idx;
}
if !keep_edge {
remove_edge(new_idx, edges);
insert_edge_after(new_idx, start_idx, edges);
}
start_idx = new_idx;
new_idx = next_idx;
if edges[new_idx].first_y != curr_y {
break;
}
}
}
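The helpers above thread a doubly linked list through a slice using `prev`/`next` indices instead of pointers. A minimal standalone sketch of that pattern (hypothetical `Node` type, not the crate's `Edge`), assuming sentinel head/tail nodes like the edge list uses:

```rust
// Index-based doubly linked list, as used by remove_edge/insert_edge_after:
// nodes live in a slice and link to each other via Option<u32> indices.
struct Node {
    x: i32,
    prev: Option<u32>,
    next: Option<u32>,
}

fn remove(idx: usize, nodes: &mut [Node]) {
    // Unlink `idx` by making its neighbors point at each other.
    let prev = nodes[idx].prev.unwrap();
    let next = nodes[idx].next.unwrap();
    nodes[prev as usize].next = Some(next);
    nodes[next as usize].prev = Some(prev);
}

fn insert_after(idx: usize, after: usize, nodes: &mut [Node]) {
    // Splice `idx` in between `after` and its current successor.
    nodes[idx].prev = Some(after as u32);
    nodes[idx].next = nodes[after].next;
    let after_next = nodes[after].next.unwrap() as usize;
    nodes[after_next].prev = Some(idx as u32);
    nodes[after].next = Some(idx as u32);
}

fn main() {
    // Sentinel head (0) and tail (3) bracket two payload nodes.
    let mut nodes = vec![
        Node { x: i32::MIN, prev: None, next: Some(1) },
        Node { x: 10, prev: Some(0), next: Some(2) },
        Node { x: 20, prev: Some(1), next: Some(3) },
        Node { x: i32::MAX, prev: Some(2), next: None },
    ];
    // Move node 2 in front of node 1: remove, then re-insert after head,
    // the same two-step dance backward_insert_edge_based_on_x performs.
    remove(2, &mut nodes);
    insert_after(2, 0, &mut nodes);
    assert_eq!(nodes[0].next, Some(2));
    assert_eq!(nodes[2].next, Some(1));
    assert_eq!(nodes[1].next, Some(3));
    println!("ok");
}
```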


@@ -0,0 +1,275 @@
// Copyright 2006 The Android Open Source Project
// Copyright 2020 Yevhenii Reizner
//
// Use of this source code is governed by a BSD-style license that can be
// found in the LICENSE file.
use core::convert::TryFrom;
use tiny_skia_path::ScreenIntRect;
use crate::{FillRule, IntRect, LengthU32, Path, Rect};
use crate::alpha_runs::AlphaRuns;
use crate::blitter::Blitter;
use crate::color::AlphaU8;
use crate::math::left_shift;
#[cfg(all(not(feature = "std"), feature = "no-std-float"))]
use tiny_skia_path::NoStdFloat;
/// Controls how much we super-sample (when we use that scan conversion).
const SUPERSAMPLE_SHIFT: u32 = 2;
const SHIFT: u32 = SUPERSAMPLE_SHIFT;
const SCALE: u32 = 1 << SHIFT;
const MASK: u32 = SCALE - 1;
pub fn fill_path(
path: &Path,
fill_rule: FillRule,
clip: &ScreenIntRect,
blitter: &mut dyn Blitter,
) -> Option<()> {
// Unlike `path.bounds.to_rect()?.round_out()`,
// this method rounds out first and then converts into a Rect.
let ir = Rect::from_ltrb(
path.bounds().left().floor(),
path.bounds().top().floor(),
path.bounds().right().ceil(),
path.bounds().bottom().ceil(),
)?
.round_out()?;
// If the intersection of the path bounds and the clip bounds
// will overflow 32767 when << by SHIFT, we can't supersample,
// so draw without antialiasing.
let clipped_ir = ir.intersect(&clip.to_int_rect())?;
if rect_overflows_short_shift(&clipped_ir, SHIFT as i32) != 0 {
return super::path::fill_path(path, fill_rule, clip, blitter);
}
// Our antialiasing can't handle a clip larger than 32767.
// TODO: skia actually limits the clip to 32767
{
const MAX_CLIP_COORD: u32 = 32767;
if clip.right() > MAX_CLIP_COORD || clip.bottom() > MAX_CLIP_COORD {
return None;
}
}
// TODO: SkScanClipper
// TODO: AAA
fill_path_impl(path, fill_rule, &ir, clip, blitter)
}
// Would any of the coordinates of this rectangle not fit in a short,
// when left-shifted by shift?
fn rect_overflows_short_shift(rect: &IntRect, shift: i32) -> i32 {
debug_assert!(overflows_short_shift(8191, shift) == 0);
debug_assert!(overflows_short_shift(8192, shift) != 0);
debug_assert!(overflows_short_shift(32767, 0) == 0);
debug_assert!(overflows_short_shift(32768, 0) != 0);
// Since we expect these to succeed, we bit-or together
// for a tiny extra bit of speed.
overflows_short_shift(rect.left(), shift)
| overflows_short_shift(rect.top(), shift)
| overflows_short_shift(rect.right(), shift)
| overflows_short_shift(rect.bottom(), shift)
}
fn overflows_short_shift(value: i32, shift: i32) -> i32 {
let s = 16 + shift;
(left_shift(value, s) >> s) - value
}
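The round-trip shift trick in `overflows_short_shift` can be checked in isolation. A standalone sketch with the wrapping shift made explicit (this mirrors, but is not, the crate's `left_shift` helper):

```rust
// Shift a coordinate left by 16 + shift (wrapping), arithmetic-shift it
// back, and compare. The round trip is lossless only when
// (value << shift) still fits in a signed 16-bit value.
fn overflows_short_shift(value: i32, shift: u32) -> i32 {
    let s = 16 + shift;
    (((value as u32).wrapping_shl(s) as i32) >> s) - value
}

fn main() {
    // With SHIFT = 2: 8191 << 2 fits in an i16, 8192 << 2 does not,
    // matching the debug asserts in rect_overflows_short_shift.
    assert_eq!(overflows_short_shift(8191, 2), 0);
    assert_ne!(overflows_short_shift(8192, 2), 0);
    assert_eq!(overflows_short_shift(32767, 0), 0);
    assert_ne!(overflows_short_shift(32768, 0), 0);
    println!("ok");
}
```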
fn fill_path_impl(
path: &Path,
fill_rule: FillRule,
bounds: &IntRect,
clip: &ScreenIntRect,
blitter: &mut dyn Blitter,
) -> Option<()> {
// TODO: MaskSuperBlitter
// TODO: 15% slower than skia, find out why
let mut blitter = SuperBlitter::new(bounds, clip, blitter)?;
let path_contained_in_clip = if let Some(bounds) = bounds.to_screen_int_rect() {
clip.contains(&bounds)
} else {
// If bounds cannot be converted into ScreenIntRect,
// the path is out of clip.
false
};
super::path::fill_path_impl(
path,
fill_rule,
clip,
bounds.top(),
bounds.bottom(),
SHIFT as i32,
path_contained_in_clip,
&mut blitter,
)
}
struct BaseSuperBlitter<'a> {
real_blitter: &'a mut dyn Blitter,
/// Current y coordinate, in destination coordinates.
curr_iy: i32,
/// Widest row of region to be blitted, in destination coordinates.
width: LengthU32,
/// Leftmost x coordinate in any row, in destination coordinates.
left: u32,
/// Leftmost x coordinate in any row, in supersampled coordinates.
super_left: u32,
/// Current y coordinate in supersampled coordinates.
curr_y: i32,
/// Initial y coordinate (top of bounds).
top: i32,
}
impl<'a> BaseSuperBlitter<'a> {
fn new(
bounds: &IntRect,
clip_rect: &ScreenIntRect,
blitter: &'a mut dyn Blitter,
) -> Option<Self> {
let sect = bounds
.intersect(&clip_rect.to_int_rect())?
.to_screen_int_rect()?;
Some(BaseSuperBlitter {
real_blitter: blitter,
curr_iy: sect.top() as i32 - 1,
width: sect.width_safe(),
left: sect.left(),
super_left: sect.left() << SHIFT,
curr_y: (sect.top() << SHIFT) as i32 - 1,
top: sect.top() as i32,
})
}
}
struct SuperBlitter<'a> {
base: BaseSuperBlitter<'a>,
runs: AlphaRuns,
offset_x: usize,
}
impl<'a> SuperBlitter<'a> {
fn new(
bounds: &IntRect,
clip_rect: &ScreenIntRect,
blitter: &'a mut dyn Blitter,
) -> Option<Self> {
let base = BaseSuperBlitter::new(bounds, clip_rect, blitter)?;
let runs_width = base.width;
Some(SuperBlitter {
base,
runs: AlphaRuns::new(runs_width),
offset_x: 0,
})
}
/// Once `runs` contains a complete supersampled row, flush() blits
/// it out through the wrapped blitter.
fn flush(&mut self) {
if self.base.curr_iy >= self.base.top {
if !self.runs.is_empty() {
self.base.real_blitter.blit_anti_h(
self.base.left,
u32::try_from(self.base.curr_iy).unwrap(),
&mut self.runs.alpha,
&mut self.runs.runs,
);
self.runs.reset(self.base.width);
self.offset_x = 0;
}
self.base.curr_iy = self.base.top - 1;
}
}
}
impl Drop for SuperBlitter<'_> {
fn drop(&mut self) {
self.flush();
}
}
impl Blitter for SuperBlitter<'_> {
/// Blits a row of pixels, with location and width specified
/// in supersampled coordinates.
fn blit_h(&mut self, mut x: u32, y: u32, mut width: LengthU32) {
let iy = (y >> SHIFT) as i32;
debug_assert!(iy >= self.base.curr_iy);
// hack, until I figure out why my cubics (I think) go beyond the bounds
match x.checked_sub(self.base.super_left) {
Some(n) => x = n,
None => {
width = LengthU32::new(x + width.get()).unwrap();
x = 0;
}
}
debug_assert!(y as i32 >= self.base.curr_y);
if self.base.curr_y != y as i32 {
self.offset_x = 0;
self.base.curr_y = y as i32;
}
if iy != self.base.curr_iy {
// new scanline
self.flush();
self.base.curr_iy = iy;
}
let start = x;
let stop = x + width.get();
debug_assert!(stop > start);
// integer-pixel-aligned ends of blit, rounded out
let mut fb = start & MASK;
let mut fe = stop & MASK;
let mut n: i32 = (stop as i32 >> SHIFT) - (start as i32 >> SHIFT) - 1;
if n < 0 {
fb = fe - fb;
n = 0;
fe = 0;
} else {
if fb == 0 {
n += 1;
} else {
fb = SCALE - fb;
}
}
let max_value = u8::try_from((1 << (8 - SHIFT)) - (((y & MASK) + 1) >> SHIFT)).unwrap();
self.offset_x = self.runs.add(
x >> SHIFT,
coverage_to_partial_alpha(fb),
n as usize,
coverage_to_partial_alpha(fe),
max_value,
self.offset_x,
);
}
}
// coverage_to_partial_alpha() is being used by AlphaRuns, which
// *accumulates* SCALE pixels worth of "alpha" in [0,(256/SCALE)]
// to produce a final value in [0, 255] and handles clamping 256->255
// itself, with the same (alpha - (alpha >> 8)) correction as
// coverage_to_exact_alpha().
fn coverage_to_partial_alpha(mut aa: u32) -> AlphaU8 {
aa <<= 8 - 2 * SHIFT;
aa as AlphaU8
}
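The arithmetic above is easier to see with concrete numbers. A sketch of the supersampled-alpha accumulation, assuming `SHIFT = 2` as in this file (the accumulator here stands in for `AlphaRuns`):

```rust
// With SHIFT = 2, SCALE = 4: each supersampled row contributes coverage
// in [0, SCALE], which coverage_to_partial_alpha maps into
// [0, 256 / SCALE]. Accumulating SCALE full rows yields 256, which is
// clamped to 255 via the (alpha - (alpha >> 8)) correction.
const SHIFT: u32 = 2;
const SCALE: u32 = 1 << SHIFT;

fn coverage_to_partial_alpha(aa: u32) -> u8 {
    (aa << (8 - 2 * SHIFT)) as u8
}

fn main() {
    // One fully covered supersample contributes 256 / SCALE = 64.
    assert_eq!(coverage_to_partial_alpha(SCALE), 64);
    // Accumulate SCALE fully covered rows, then clamp 256 -> 255.
    let sum: u32 = (0..SCALE)
        .map(|_| coverage_to_partial_alpha(SCALE) as u32)
        .sum();
    assert_eq!(sum - (sum >> 8), 255);
    println!("ok");
}
```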


@@ -0,0 +1,261 @@
// Copyright 2006 The Android Open Source Project
// Copyright 2020 Yevhenii Reizner
//
// Use of this source code is governed by a BSD-style license that can be
// found in the LICENSE file.
use alloc::vec::Vec;
use tiny_skia_path::{NormalizedF32, Scalar};
use crate::{Color, SpreadMode, Transform};
use crate::pipeline::RasterPipelineBuilder;
use crate::pipeline::{self, EvenlySpaced2StopGradientCtx, GradientColor, GradientCtx};
// The default SCALAR_NEARLY_ZERO threshold of .0024 is too big and causes regressions for svg
// gradients defined in the wild.
pub const DEGENERATE_THRESHOLD: f32 = 1.0 / (1 << 15) as f32;
/// A gradient point.
#[allow(missing_docs)]
#[derive(Clone, Copy, PartialEq, Debug)]
pub struct GradientStop {
pub(crate) position: NormalizedF32,
pub(crate) color: Color,
}
impl GradientStop {
/// Creates a new gradient point.
///
/// `position` will be clamped to a 0..=1 range.
pub fn new(position: f32, color: Color) -> Self {
GradientStop {
position: NormalizedF32::new_clamped(position),
color,
}
}
}
#[derive(Clone, PartialEq, Debug)]
pub struct Gradient {
stops: Vec<GradientStop>,
tile_mode: SpreadMode,
pub(crate) transform: Transform,
points_to_unit: Transform,
pub(crate) colors_are_opaque: bool,
has_uniform_stops: bool,
}
impl Gradient {
pub fn new(
mut stops: Vec<GradientStop>,
tile_mode: SpreadMode,
transform: Transform,
points_to_unit: Transform,
) -> Self {
debug_assert!(stops.len() > 1);
// Note: we let the caller skip the first and/or last position.
// i.e. pos[0] = 0.3, pos[1] = 0.7
// In these cases, we insert dummy entries to ensure that the final data
// will be bracketed by [0, 1].
// i.e. our_pos[0] = 0, our_pos[1] = 0.3, our_pos[2] = 0.7, our_pos[3] = 1
let dummy_first = stops[0].position.get() != 0.0;
let dummy_last = stops[stops.len() - 1].position.get() != 1.0;
// Now copy over the colors, adding the dummies as needed.
if dummy_first {
stops.insert(0, GradientStop::new(0.0, stops[0].color));
}
if dummy_last {
stops.push(GradientStop::new(1.0, stops[stops.len() - 1].color));
}
let colors_are_opaque = stops.iter().all(|p| p.color.is_opaque());
// Pin the last value to 1.0, and make sure positions are monotonic.
let start_index = if dummy_first { 0 } else { 1 };
let mut prev = 0.0;
let mut has_uniform_stops = true;
let uniform_step = stops[start_index].position.get() - prev;
for i in start_index..stops.len() {
let curr = if i + 1 == stops.len() {
// The last stop must be pinned to 1.0.
1.0
} else {
stops[i].position.get().bound(prev, 1.0)
};
has_uniform_stops &= uniform_step.is_nearly_equal(curr - prev);
stops[i].position = NormalizedF32::new_clamped(curr);
prev = curr;
}
Gradient {
stops,
tile_mode,
transform,
points_to_unit,
colors_are_opaque,
has_uniform_stops,
}
}
pub fn push_stages(
&self,
p: &mut RasterPipelineBuilder,
push_stages_pre: &dyn Fn(&mut RasterPipelineBuilder),
push_stages_post: &dyn Fn(&mut RasterPipelineBuilder),
) -> Option<()> {
p.push(pipeline::Stage::SeedShader);
let ts = self.transform.invert()?;
let ts = ts.post_concat(self.points_to_unit);
p.push_transform(ts);
push_stages_pre(p);
match self.tile_mode {
SpreadMode::Reflect => {
p.push(pipeline::Stage::ReflectX1);
}
SpreadMode::Repeat => {
p.push(pipeline::Stage::RepeatX1);
}
SpreadMode::Pad => {
if self.has_uniform_stops {
// We clamp only when the stops are evenly spaced.
// If not, there may be hard stops, and clamping ruins hard stops at 0 and/or 1.
// In that case, we must make sure we're using the general "gradient" stage,
// which is the only stage that will correctly handle unclamped t.
p.push(pipeline::Stage::PadX1);
}
}
}
// The two-stop case with stops at 0 and 1.
if self.stops.len() == 2 {
debug_assert!(self.has_uniform_stops);
let c0 = self.stops[0].color;
let c1 = self.stops[1].color;
p.ctx.evenly_spaced_2_stop_gradient = EvenlySpaced2StopGradientCtx {
factor: GradientColor::new(
c1.red() - c0.red(),
c1.green() - c0.green(),
c1.blue() - c0.blue(),
c1.alpha() - c0.alpha(),
),
bias: GradientColor::from(c0),
};
p.push(pipeline::Stage::EvenlySpaced2StopGradient);
} else {
// Unlike Skia, we do not support the `evenly_spaced_gradient` stage.
// In our case, there is no performance difference.
let mut ctx = GradientCtx::default();
// Note: In order to handle clamps in search, the search assumes
// a stop conceptually placed at -inf.
// Therefore, the max number of stops is `self.stops.len() + 1`.
//
// We also need at least 16 values for lowp pipeline.
ctx.factors.reserve((self.stops.len() + 1).max(16));
ctx.biases.reserve((self.stops.len() + 1).max(16));
ctx.t_values.reserve(self.stops.len() + 1);
// Remove the dummy stops inserted by Gradient::new
// because they are naturally handled by the search method.
let (first_stop, last_stop) = if self.stops.len() > 2 {
let first = if self.stops[0].color != self.stops[1].color {
0
} else {
1
};
let len = self.stops.len();
let last = if self.stops[len - 2].color != self.stops[len - 1].color {
len - 1
} else {
len - 2
};
(first, last)
} else {
(0, 1)
};
let mut t_l = self.stops[first_stop].position.get();
let mut c_l = GradientColor::from(self.stops[first_stop].color);
ctx.push_const_color(c_l);
ctx.t_values.push(NormalizedF32::ZERO);
// N.B. `last_stop` is the index of the last stop, not one past it.
for i in first_stop..last_stop {
let t_r = self.stops[i + 1].position.get();
let c_r = GradientColor::from(self.stops[i + 1].color);
debug_assert!(t_l <= t_r);
if t_l < t_r {
// For each stop we calculate a bias B and a scale factor F, such that
// for any t between stops n and n+1, the color we want is B[n] + F[n]*t.
let f = GradientColor::new(
(c_r.r - c_l.r) / (t_r - t_l),
(c_r.g - c_l.g) / (t_r - t_l),
(c_r.b - c_l.b) / (t_r - t_l),
(c_r.a - c_l.a) / (t_r - t_l),
);
ctx.factors.push(f);
ctx.biases.push(GradientColor::new(
c_l.r - f.r * t_l,
c_l.g - f.g * t_l,
c_l.b - f.b * t_l,
c_l.a - f.a * t_l,
));
ctx.t_values.push(NormalizedF32::new_clamped(t_l));
}
t_l = t_r;
c_l = c_r;
}
ctx.push_const_color(c_l);
ctx.t_values.push(NormalizedF32::new_clamped(t_l));
ctx.len = ctx.factors.len();
// All lists must have the same length.
debug_assert_eq!(ctx.factors.len(), ctx.t_values.len());
debug_assert_eq!(ctx.biases.len(), ctx.t_values.len());
// Fill with zeros until we have enough data to fit into F32x16.
while ctx.factors.len() < 16 {
ctx.factors.push(GradientColor::default());
ctx.biases.push(GradientColor::default());
}
p.push(pipeline::Stage::Gradient);
p.ctx.gradient = ctx;
}
if !self.colors_are_opaque {
p.push(pipeline::Stage::Premultiply);
}
push_stages_post(p);
Some(())
}
pub fn apply_opacity(&mut self, opacity: f32) {
for stop in &mut self.stops {
stop.color.apply_opacity(opacity);
}
self.colors_are_opaque = self.stops.iter().all(|p| p.color.is_opaque());
}
}
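The per-interval factor/bias scheme built in `push_stages` can be verified on a single channel: for stops `(t_l, c_l)` and `(t_r, c_r)` the shader evaluates `c(t) = bias + factor * t`, which must reproduce both endpoints. A minimal sketch (hypothetical helper, not the crate's API):

```rust
// For a stop interval [t_l, t_r] with single-channel colors c_l, c_r,
// compute the factor F and bias B such that c(t) = B + F * t.
fn factor_bias(t_l: f32, c_l: f32, t_r: f32, c_r: f32) -> (f32, f32) {
    let f = (c_r - c_l) / (t_r - t_l);
    let b = c_l - f * t_l;
    (f, b)
}

fn main() {
    // Interval from t = 0.25 (value 0.2) to t = 0.75 (value 1.0).
    let (f, b) = factor_bias(0.25, 0.2, 0.75, 1.0);
    // The linear form reproduces both endpoints...
    assert!((b + f * 0.25 - 0.2).abs() < 1e-6);
    assert!((b + f * 0.75 - 1.0).abs() < 1e-6);
    // ...and interpolates linearly in between.
    assert!((b + f * 0.5 - 0.6).abs() < 1e-6);
    println!("ok");
}
```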


@@ -0,0 +1,178 @@
// Copyright 2006 The Android Open Source Project
// Copyright 2020 Yevhenii Reizner
//
// Use of this source code is governed by a BSD-style license that can be
// found in the LICENSE file.
use alloc::vec::Vec;
use tiny_skia_path::Scalar;
use crate::{Color, GradientStop, Point, Shader, SpreadMode, Transform};
use super::gradient::{Gradient, DEGENERATE_THRESHOLD};
use crate::pipeline::RasterPipelineBuilder;
/// A linear gradient shader.
#[derive(Clone, PartialEq, Debug)]
pub struct LinearGradient {
pub(crate) base: Gradient,
}
impl LinearGradient {
/// Creates a new linear gradient shader.
///
/// Returns `Shader::SolidColor` when:
/// - `stops.len()` == 1
/// - `start` and `end` are very close
///
/// Returns `None` when:
///
/// - `stops` is empty
/// - `start` == `end`
/// - `transform` is not invertible
#[allow(clippy::new_ret_no_self)]
pub fn new(
start: Point,
end: Point,
stops: Vec<GradientStop>,
mode: SpreadMode,
transform: Transform,
) -> Option<Shader<'static>> {
if stops.is_empty() {
return None;
}
if stops.len() == 1 {
return Some(Shader::SolidColor(stops[0].color));
}
let length = (end - start).length();
if !length.is_finite() {
return None;
}
if length.is_nearly_zero_within_tolerance(DEGENERATE_THRESHOLD) {
// Degenerate gradient, the only tricky complication is when in clamp mode,
// the limit of the gradient approaches two half planes of solid color
// (first and last). However, they are divided by the line perpendicular
// to the start and end point, which becomes undefined once start and end
// are exactly the same, so just use the end color for a stable solution.
// Except for special circumstances of clamped gradients,
// every gradient shape (when degenerate) can be mapped to the same fallbacks.
// The specific shape factories must account for special clamped conditions separately,
// this will always return the last color for clamped gradients.
match mode {
SpreadMode::Pad => {
// Depending on how the gradient shape degenerates,
// there may be a more specialized fallback representation
// for the factories to use, but this is a reasonable default.
return Some(Shader::SolidColor(stops.last().unwrap().color));
}
SpreadMode::Reflect | SpreadMode::Repeat => {
// repeat and mirror are treated the same: the border colors are never visible,
// but approximate the final color as infinite repetitions of the colors, so
// it can be represented as the average color of the gradient.
return Some(Shader::SolidColor(average_gradient_color(&stops)));
}
}
}
transform.invert()?;
let unit_ts = points_to_unit_ts(start, end)?;
Some(Shader::LinearGradient(LinearGradient {
base: Gradient::new(stops, mode, transform, unit_ts),
}))
}
pub(crate) fn is_opaque(&self) -> bool {
self.base.colors_are_opaque
}
pub(crate) fn push_stages(&self, p: &mut RasterPipelineBuilder) -> Option<()> {
self.base.push_stages(p, &|_| {}, &|_| {})
}
}
fn points_to_unit_ts(start: Point, end: Point) -> Option<Transform> {
let mut vec = end - start;
let mag = vec.length();
let inv = if mag != 0.0 { mag.invert() } else { 0.0 };
vec.scale(inv);
let mut ts = ts_from_sin_cos_at(-vec.y, vec.x, start.x, start.y);
ts = ts.post_translate(-start.x, -start.y);
ts = ts.post_scale(inv, inv);
Some(ts)
}
fn average_gradient_color(points: &[GradientStop]) -> Color {
use crate::wide::f32x4;
fn load_color(c: Color) -> f32x4 {
f32x4::from([c.red(), c.green(), c.blue(), c.alpha()])
}
fn store_color(c: f32x4) -> Color {
let c: [f32; 4] = c.into();
Color::from_rgba(c[0], c[1], c[2], c[3]).unwrap()
}
assert!(!points.is_empty());
// The gradient is a piecewise linear interpolation between colors. For a given interval,
// the integral between the two endpoints is 0.5 * (ci + cj) * (pj - pi), which gives that
// interval's average color. The overall average color is thus the sum of each piece. The thing
// to keep in mind is that the provided gradient definition may implicitly use p=0 and p=1.
let mut blend = f32x4::splat(0.0);
// Bake 1/(colorCount - 1) uniform stop difference into this scale factor
let w_scale = f32x4::splat(0.5);
for i in 0..points.len() - 1 {
// Calculate the average color for the interval between pos(i) and pos(i+1)
let c0 = load_color(points[i].color);
let c1 = load_color(points[i + 1].color);
// when pos == null, there are colorCount uniformly distributed stops, going from 0 to 1,
// so pos[i + 1] - pos[i] = 1/(colorCount-1)
let w = points[i + 1].position.get() - points[i].position.get();
blend += w_scale * f32x4::splat(w) * (c1 + c0);
}
// Now account for any implicit intervals at the start or end of the stop definitions
if points[0].position.get() > 0.0 {
// The first color is fixed between p = 0 to pos[0], so 0.5 * (ci + cj) * (pj - pi)
// becomes 0.5 * (c + c) * (pj - 0) = c * pj
let c = load_color(points[0].color);
blend += f32x4::splat(points[0].position.get()) * c;
}
let last_idx = points.len() - 1;
if points[last_idx].position.get() < 1.0 {
// The last color is fixed between pos[n-1] to p = 1, so 0.5 * (ci + cj) * (pj - pi)
// becomes 0.5 * (c + c) * (1 - pi) = c * (1 - pi)
let c = load_color(points[last_idx].color);
blend += (f32x4::splat(1.0) - f32x4::splat(points[last_idx].position.get())) * c;
}
store_color(blend)
}
fn ts_from_sin_cos_at(sin: f32, cos: f32, px: f32, py: f32) -> Transform {
let cos_inv = 1.0 - cos;
Transform::from_row(
cos,
sin,
-sin,
cos,
sdot(sin, py, cos_inv, px),
sdot(-sin, px, cos_inv, py),
)
}
fn sdot(a: f32, b: f32, c: f32, d: f32) -> f32 {
a * b + c * d
}
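The piecewise integral that `average_gradient_color` computes reduces nicely to a single channel. A sketch under that simplification (hypothetical `average` helper; the real function works on `f32x4` colors):

```rust
// Mean of a piecewise linear gradient: sum 0.5 * (ci + cj) * (pj - pi)
// per interval, plus constant caps for the implicit [0, p0] and [pn, 1]
// runs when the stops don't span the full [0, 1] range.
fn average(stops: &[(f32, f32)]) -> f32 {
    let mut avg = 0.0;
    for w in stops.windows(2) {
        let ((p0, c0), (p1, c1)) = (w[0], w[1]);
        avg += 0.5 * (c0 + c1) * (p1 - p0);
    }
    // Implicit flat runs before the first and after the last stop.
    avg += stops[0].1 * stops[0].0;
    avg += stops[stops.len() - 1].1 * (1.0 - stops[stops.len() - 1].0);
    avg
}

fn main() {
    // Black (0.0) to white (1.0) over the full range averages to 0.5.
    assert!((average(&[(0.0, 0.0), (1.0, 1.0)]) - 0.5).abs() < 1e-6);
    // Ramp to 1.0 by p = 0.5, then flat: 0.25 from the ramp half
    // plus 0.5 from the flat half.
    let a = average(&[(0.0, 0.0), (0.5, 1.0), (1.0, 1.0)]);
    assert!((a - 0.75).abs() < 1e-6);
    println!("ok");
}
```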


@@ -0,0 +1,135 @@
// Copyright 2006 The Android Open Source Project
// Copyright 2020 Yevhenii Reizner
//
// Use of this source code is governed by a BSD-style license that can be
// found in the LICENSE file.
mod gradient;
mod linear_gradient;
mod pattern;
mod radial_gradient;
use tiny_skia_path::{NormalizedF32, Scalar};
pub use gradient::GradientStop;
pub use linear_gradient::LinearGradient;
pub use pattern::{FilterQuality, Pattern, PixmapPaint};
pub use radial_gradient::RadialGradient;
use crate::{Color, Transform};
use crate::pipeline::RasterPipelineBuilder;
/// A shader spreading mode.
#[derive(Copy, Clone, PartialEq, Debug)]
pub enum SpreadMode {
/// Replicate the edge color if the shader draws outside of its
/// original bounds.
Pad,
/// Repeat the shader's image horizontally and vertically, alternating
/// mirror images so that adjacent images always seam.
Reflect,
/// Repeat the shader's image horizontally and vertically.
Repeat,
}
impl Default for SpreadMode {
fn default() -> Self {
SpreadMode::Pad
}
}
/// A shader specifies the source color(s) for what is being drawn.
///
/// If a paint has no shader, then the paint's color is used. If the paint has a
/// shader, then the shader's color(s) are used instead, but they are
/// modulated by the paint's alpha. This makes it easy to create a shader
/// once (e.g. bitmap tiling or gradient) and then change its transparency
/// without having to modify the original shader. Only the paint's alpha needs
/// to be modified.
#[derive(Clone, PartialEq, Debug)]
pub enum Shader<'a> {
/// A solid color shader.
SolidColor(Color),
/// A linear gradient shader.
LinearGradient(LinearGradient),
/// A radial gradient shader.
RadialGradient(RadialGradient),
/// A pattern shader.
Pattern(Pattern<'a>),
}
impl<'a> Shader<'a> {
/// Checks if the shader is guaranteed to produce only opaque colors.
pub fn is_opaque(&self) -> bool {
match self {
Shader::SolidColor(ref c) => c.is_opaque(),
Shader::LinearGradient(ref g) => g.is_opaque(),
Shader::RadialGradient(_) => false,
Shader::Pattern(_) => false,
}
}
// Unlike Skia, we do not have is_constant, because we don't have Color shaders.
/// If this returns `None`, then we draw nothing (do not fall back to shader context)
pub(crate) fn push_stages(&self, p: &mut RasterPipelineBuilder) -> Option<()> {
match self {
Shader::SolidColor(color) => {
p.push_uniform_color(color.premultiply());
Some(())
}
Shader::LinearGradient(ref g) => g.push_stages(p),
Shader::RadialGradient(ref g) => g.push_stages(p),
Shader::Pattern(ref patt) => patt.push_stages(p),
}
}
/// Transforms the shader.
pub fn transform(&mut self, ts: Transform) {
match self {
Shader::SolidColor(_) => {}
Shader::LinearGradient(g) => {
g.base.transform = g.base.transform.post_concat(ts);
}
Shader::RadialGradient(g) => {
g.base.transform = g.base.transform.post_concat(ts);
}
Shader::Pattern(p) => {
p.transform = p.transform.post_concat(ts);
}
}
}
/// Shifts shader's opacity.
///
/// `opacity` will be clamped to the 0..=1 range.
///
/// This is roughly the same as Skia's `SkPaint::setAlpha`.
///
/// Unlike Skia, we do not support global alpha/opacity, which in Skia
/// is set via the alpha channel of the `SkPaint::fColor4f`.
/// Instead, you can shift the opacity of the shader to whatever value you need.
///
/// - For `SolidColor` this function will multiply `color.alpha` by `opacity`.
/// - For gradients this function will multiply all colors by `opacity`.
/// - For `Pattern` this function will multiply `Pattern::opacity` by `opacity`.
pub fn apply_opacity(&mut self, opacity: f32) {
match self {
Shader::SolidColor(ref mut c) => {
c.apply_opacity(opacity);
}
Shader::LinearGradient(g) => {
g.base.apply_opacity(opacity);
}
Shader::RadialGradient(g) => {
g.base.apply_opacity(opacity);
}
Shader::Pattern(ref mut p) => {
p.opacity = NormalizedF32::new(p.opacity.get() * opacity.bound(0.0, 1.0)).unwrap();
}
}
}
}
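The three `SpreadMode` variants documented above amount to simple mappings of the sampling parameter `t` onto `[0, 1]`. A hypothetical scalar sketch of those mappings (the real work happens in SIMD pipeline stages such as `PadX1`/`RepeatX1`/`ReflectX1`):

```rust
// Pad: clamp to the edge; Repeat: keep only the fractional part;
// Reflect: triangle wave with period 2 (forward on 0..1, mirrored
// back on 1..2).
fn pad(t: f32) -> f32 {
    t.clamp(0.0, 1.0)
}

fn repeat(t: f32) -> f32 {
    t - t.floor()
}

fn reflect(t: f32) -> f32 {
    let t = (t / 2.0 - (t / 2.0).floor()) * 2.0;
    if t > 1.0 { 2.0 - t } else { t }
}

fn main() {
    // Outside the gradient, Pad holds the edge color...
    assert_eq!(pad(1.5), 1.0);
    // ...Repeat tiles the gradient...
    assert_eq!(repeat(1.25), 0.25);
    // ...and Reflect mirrors every other tile.
    assert!((reflect(1.25) - 0.75).abs() < 1e-6);
    println!("ok");
}
```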


@@ -0,0 +1,176 @@
// Copyright 2006 The Android Open Source Project
// Copyright 2020 Yevhenii Reizner
//
// Use of this source code is governed by a BSD-style license that can be
// found in the LICENSE file.
use tiny_skia_path::NormalizedF32;
use crate::{BlendMode, PixmapRef, Shader, SpreadMode, Transform};
use crate::pipeline;
use crate::pipeline::RasterPipelineBuilder;
#[cfg(all(not(feature = "std"), feature = "no-std-float"))]
use tiny_skia_path::NoStdFloat;
/// Controls how much filtering is done when transforming images.
#[derive(Copy, Clone, PartialEq, Debug)]
pub enum FilterQuality {
/// Nearest-neighbor. Low quality, but fastest.
Nearest,
/// Bilinear.
Bilinear,
/// Bicubic. High quality, but slow.
Bicubic,
}
/// Controls how a pixmap should be blended.
///
/// Like `Paint`, but for `Pixmap`.
#[derive(Copy, Clone, PartialEq, Debug)]
pub struct PixmapPaint {
/// Pixmap opacity.
///
/// Must be in 0..=1 range.
///
/// Default: 1.0
pub opacity: f32,
/// Pixmap blending mode.
///
/// Default: SourceOver
pub blend_mode: BlendMode,
/// Specifies how much filtering is done when transforming images.
///
/// Default: Nearest
pub quality: FilterQuality,
}
impl Default for PixmapPaint {
fn default() -> Self {
PixmapPaint {
opacity: 1.0,
blend_mode: BlendMode::default(),
quality: FilterQuality::Nearest,
}
}
}
/// A pattern shader.
///
/// Essentially a `SkImageShader`.
///
/// Unlike Skia, we do not support FilterQuality::Medium, because it involves
/// mipmap generation, which adds too much complexity.
#[derive(Clone, PartialEq, Debug)]
pub struct Pattern<'a> {
pub(crate) pixmap: PixmapRef<'a>,
quality: FilterQuality,
spread_mode: SpreadMode,
pub(crate) opacity: NormalizedF32,
pub(crate) transform: Transform,
}
impl<'a> Pattern<'a> {
/// Creates a new pattern shader.
///
/// `opacity` will be clamped to the 0..=1 range.
#[allow(clippy::new_ret_no_self)]
pub fn new(
pixmap: PixmapRef<'a>,
spread_mode: SpreadMode,
quality: FilterQuality,
opacity: f32,
transform: Transform,
) -> Shader {
Shader::Pattern(Pattern {
pixmap,
spread_mode,
quality,
opacity: NormalizedF32::new_clamped(opacity),
transform,
})
}
pub(crate) fn push_stages(&self, p: &mut RasterPipelineBuilder) -> Option<()> {
let ts = self.transform.invert()?;
p.push(pipeline::Stage::SeedShader);
p.push_transform(ts);
let mut quality = self.quality;
if ts.is_identity() || ts.is_translate() {
quality = FilterQuality::Nearest;
}
if quality == FilterQuality::Bilinear {
if ts.is_translate() {
if ts.tx == ts.tx.trunc() && ts.ty == ts.ty.trunc() {
// When the matrix is just an integer translate, bilerp == nearest neighbor.
quality = FilterQuality::Nearest;
}
}
}
// TODO: minimizing scale via mipmap
match quality {
FilterQuality::Nearest => {
p.ctx.limit_x = pipeline::TileCtx {
scale: self.pixmap.width() as f32,
inv_scale: 1.0 / self.pixmap.width() as f32,
};
p.ctx.limit_y = pipeline::TileCtx {
scale: self.pixmap.height() as f32,
inv_scale: 1.0 / self.pixmap.height() as f32,
};
match self.spread_mode {
SpreadMode::Pad => { /* The gather() stage will clamp for us. */ }
SpreadMode::Repeat => p.push(pipeline::Stage::Repeat),
SpreadMode::Reflect => p.push(pipeline::Stage::Reflect),
}
p.push(pipeline::Stage::Gather);
}
FilterQuality::Bilinear => {
p.ctx.sampler = pipeline::SamplerCtx {
spread_mode: self.spread_mode,
inv_width: 1.0 / self.pixmap.width() as f32,
inv_height: 1.0 / self.pixmap.height() as f32,
};
p.push(pipeline::Stage::Bilinear);
}
FilterQuality::Bicubic => {
p.ctx.sampler = pipeline::SamplerCtx {
spread_mode: self.spread_mode,
inv_width: 1.0 / self.pixmap.width() as f32,
inv_height: 1.0 / self.pixmap.height() as f32,
};
p.push(pipeline::Stage::Bicubic);
// Bicubic filtering naturally produces out of range values on both sides of [0,1].
p.push(pipeline::Stage::Clamp0);
p.push(pipeline::Stage::ClampA);
}
}
// Unlike Skia, we do not support global opacity and only Pattern allows it.
if self.opacity != NormalizedF32::ONE {
debug_assert_eq!(
core::mem::size_of_val(&self.opacity),
4,
"alpha must be f32"
);
p.ctx.current_coverage = self.opacity.get();
p.push(pipeline::Stage::Scale1Float);
}
Some(())
}
}
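The quality downgrade in `push_stages` (bilinear becomes nearest for an integer translate) rests on the bilerp weights collapsing when the fractional offsets are zero. A small sketch with a hypothetical 2x2 single-channel texture (not the crate's pipeline):

```rust
// Bilinear interpolation of a 2x2 neighborhood with fractional
// offsets (fx, fy) from the top-left texel.
fn bilerp(tex: &[[f32; 2]; 2], fx: f32, fy: f32) -> f32 {
    tex[0][0] * (1.0 - fx) * (1.0 - fy)
        + tex[0][1] * fx * (1.0 - fy)
        + tex[1][0] * (1.0 - fx) * fy
        + tex[1][1] * fx * fy
}

fn main() {
    let tex = [[0.1, 0.9], [0.5, 0.3]];
    // An integer translate leaves zero fractional offsets, so bilinear
    // filtering returns the top-left texel unchanged -- i.e. it equals
    // nearest-neighbor, which is why the cheaper stage is substituted...
    assert_eq!(bilerp(&tex, 0.0, 0.0), tex[0][0]);
    // ...while a half-texel offset genuinely blends all four neighbors.
    let mid = bilerp(&tex, 0.5, 0.5);
    assert!((mid - 0.45).abs() < 1e-6);
    println!("ok");
}
```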


@@ -0,0 +1,197 @@
// Copyright 2006 The Android Open Source Project
// Copyright 2020 Yevhenii Reizner
//
// Use of this source code is governed by a BSD-style license that can be
// found in the LICENSE file.
use alloc::vec::Vec;
use tiny_skia_path::Scalar;
use crate::{GradientStop, Point, Shader, SpreadMode, Transform};
use super::gradient::{Gradient, DEGENERATE_THRESHOLD};
use crate::pipeline;
use crate::pipeline::RasterPipelineBuilder;
use crate::wide::u32x8;
#[cfg(all(not(feature = "std"), feature = "no-std-float"))]
use tiny_skia_path::NoStdFloat;
#[derive(Copy, Clone, PartialEq, Debug)]
struct FocalData {
r1: f32, // r1 after mapping focal point to (0, 0)
}
impl FocalData {
// Whether the focal point (0, 0) is on the end circle with center (1, 0) and radius r1. If
// this is true, it's as if an aircraft is flying at Mach 1 and all circles (soundwaves)
// will go through the focal point (aircraft). In our previous implementations, this was
// known as the edge case where the inside circle touches the outside circle (on the focal
// point). If we were to solve for t by brute force using a quadratic equation, this case
// implies that the quadratic equation degenerates to a linear equation.
fn is_focal_on_circle(&self) -> bool {
(1.0 - self.r1).is_nearly_zero()
}
fn is_well_behaved(&self) -> bool {
!self.is_focal_on_circle() && self.r1 > 1.0
}
}
/// A radial gradient shader.
///
/// This is not `SkRadialGradient` like in Skia, but rather `SkTwoPointConicalGradient`
/// without the start radius.
#[derive(Clone, PartialEq, Debug)]
pub struct RadialGradient {
pub(crate) base: Gradient,
focal_data: Option<FocalData>,
}
impl RadialGradient {
/// Creates a new radial gradient shader.
///
/// Returns `Shader::SolidColor` when:
/// - `stops.len()` == 1
///
/// Returns `None` when:
///
/// - `stops` is empty
/// - `radius` <= 0
/// - `transform` is not invertible
#[allow(clippy::new_ret_no_self)]
pub fn new(
start: Point,
end: Point,
radius: f32,
stops: Vec<GradientStop>,
mode: SpreadMode,
transform: Transform,
) -> Option<Shader<'static>> {
// From SkGradientShader::MakeTwoPointConical
if radius < 0.0 || radius.is_nearly_zero() {
return None;
}
if stops.is_empty() {
return None;
}
if stops.len() == 1 {
return Some(Shader::SolidColor(stops[0].color));
}
transform.invert()?;
let length = (end - start).length();
if !length.is_finite() {
return None;
}
if length.is_nearly_zero_within_tolerance(DEGENERATE_THRESHOLD) {
// If the center positions are the same, then the gradient
// is the radial variant of a 2 pt conical gradient,
// an actual radial gradient (startRadius == 0),
// or it is fully degenerate (startRadius == endRadius).
let inv = radius.invert();
let mut ts = Transform::from_translate(-start.x, -start.y);
ts = ts.post_scale(inv, inv);
// We can treat this gradient as radial, which is faster. If we got here, we know
// that endRadius is not equal to 0, so this produces a meaningful gradient
Some(Shader::RadialGradient(RadialGradient {
base: Gradient::new(stops, mode, transform, ts),
focal_data: None,
}))
} else {
// From SkTwoPointConicalGradient::Create
let mut ts = ts_from_poly_to_poly(
start,
end,
Point::from_xy(0.0, 0.0),
Point::from_xy(1.0, 0.0),
)?;
let d_center = (start - end).length();
let r1 = radius / d_center;
let focal_data = FocalData { r1 };
// The following transformations are just to accelerate the shader computation by saving
// some arithmetic operations.
if focal_data.is_focal_on_circle() {
ts = ts.post_scale(0.5, 0.5);
} else {
ts = ts.post_scale(r1 / (r1 * r1 - 1.0), 1.0 / ((r1 * r1 - 1.0).abs()).sqrt());
}
Some(Shader::RadialGradient(RadialGradient {
base: Gradient::new(stops, mode, transform, ts),
focal_data: Some(focal_data),
}))
}
}
pub(crate) fn push_stages(&self, p: &mut RasterPipelineBuilder) -> Option<()> {
let p0 = if let Some(focal_data) = self.focal_data {
1.0 / focal_data.r1
} else {
1.0
};
p.ctx.two_point_conical_gradient = pipeline::TwoPointConicalGradientCtx {
mask: u32x8::default(),
p0,
};
self.base.push_stages(
p,
&|p| {
if let Some(focal_data) = self.focal_data {
// Unlike Skia, we have only the Focal radial gradient type.
if focal_data.is_focal_on_circle() {
p.push(pipeline::Stage::XYTo2PtConicalFocalOnCircle);
} else if focal_data.is_well_behaved() {
p.push(pipeline::Stage::XYTo2PtConicalWellBehaved);
} else {
p.push(pipeline::Stage::XYTo2PtConicalGreater);
}
if !focal_data.is_well_behaved() {
p.push(pipeline::Stage::Mask2PtConicalDegenerates);
}
} else {
p.push(pipeline::Stage::XYToRadius);
}
},
&|p| {
if let Some(focal_data) = self.focal_data {
if !focal_data.is_well_behaved() {
p.push(pipeline::Stage::ApplyVectorMask);
}
}
},
)
}
}
fn ts_from_poly_to_poly(src1: Point, src2: Point, dst1: Point, dst2: Point) -> Option<Transform> {
let tmp = from_poly2(src1, src2);
let res = tmp.invert()?;
let tmp = from_poly2(dst1, dst2);
Some(tmp.pre_concat(res))
}
fn from_poly2(p0: Point, p1: Point) -> Transform {
Transform::from_row(
p1.y - p0.y,
p0.x - p1.x,
p1.x - p0.x,
p1.y - p0.y,
p0.x,
p0.y,
)
}

// Copyright 2020 Yevhenii Reizner
//
// Use of this source code is governed by a BSD-style license that can be
// found in the LICENSE file.
use super::{f32x8, u16x16};
#[derive(Copy, Clone, Debug)]
#[repr(C, align(32))]
pub struct f32x16(pub f32x8, pub f32x8);
unsafe impl bytemuck::Zeroable for f32x16 {}
unsafe impl bytemuck::Pod for f32x16 {}
impl Default for f32x16 {
fn default() -> Self {
Self::splat(0.0)
}
}
impl f32x16 {
pub fn splat(n: f32) -> Self {
Self(f32x8::splat(n), f32x8::splat(n))
}
#[inline]
pub fn abs(&self) -> Self {
// Yes, Skia does it in the same way.
let abs = |x| bytemuck::cast::<i32, f32>(bytemuck::cast::<f32, i32>(x) & 0x7fffffff);
let n0: [f32; 8] = self.0.into();
let n1: [f32; 8] = self.1.into();
Self(
f32x8::from([
abs(n0[0]),
abs(n0[1]),
abs(n0[2]),
abs(n0[3]),
abs(n0[4]),
abs(n0[5]),
abs(n0[6]),
abs(n0[7]),
]),
f32x8::from([
abs(n1[0]),
abs(n1[1]),
abs(n1[2]),
abs(n1[3]),
abs(n1[4]),
abs(n1[5]),
abs(n1[6]),
abs(n1[7]),
]),
)
}
pub fn cmp_gt(self, rhs: &Self) -> Self {
Self(self.0.cmp_gt(rhs.0), self.1.cmp_gt(rhs.1))
}
pub fn blend(self, t: Self, f: Self) -> Self {
Self(self.0.blend(t.0, f.0), self.1.blend(t.1, f.1))
}
pub fn normalize(&self) -> Self {
Self(self.0.normalize(), self.1.normalize())
}
pub fn floor(&self) -> Self {
// Yes, Skia does it in the same way.
let roundtrip = self.round();
roundtrip
- roundtrip
.cmp_gt(self)
.blend(f32x16::splat(1.0), f32x16::splat(0.0))
}
pub fn sqrt(&self) -> Self {
Self(self.0.sqrt(), self.1.sqrt())
}
pub fn round(&self) -> Self {
Self(self.0.round(), self.1.round())
}
// This method is too heavy and shouldn't be inlined.
pub fn save_to_u16x16(&self, dst: &mut u16x16) {
        // Do not use to_i32x8, because it involves rounding,
        // and Skia casts without rounding.
let n0: [f32; 8] = self.0.into();
let n1: [f32; 8] = self.1.into();
dst.0[0] = n0[0] as u16;
dst.0[1] = n0[1] as u16;
dst.0[2] = n0[2] as u16;
dst.0[3] = n0[3] as u16;
dst.0[4] = n0[4] as u16;
dst.0[5] = n0[5] as u16;
dst.0[6] = n0[6] as u16;
dst.0[7] = n0[7] as u16;
dst.0[8] = n1[0] as u16;
dst.0[9] = n1[1] as u16;
dst.0[10] = n1[2] as u16;
dst.0[11] = n1[3] as u16;
dst.0[12] = n1[4] as u16;
dst.0[13] = n1[5] as u16;
dst.0[14] = n1[6] as u16;
dst.0[15] = n1[7] as u16;
}
}
impl core::ops::Add<f32x16> for f32x16 {
type Output = Self;
fn add(self, rhs: Self) -> Self::Output {
Self(self.0 + rhs.0, self.1 + rhs.1)
}
}
impl core::ops::Sub<f32x16> for f32x16 {
type Output = Self;
fn sub(self, rhs: Self) -> Self::Output {
Self(self.0 - rhs.0, self.1 - rhs.1)
}
}
impl core::ops::Mul<f32x16> for f32x16 {
type Output = Self;
fn mul(self, rhs: Self) -> Self::Output {
Self(self.0 * rhs.0, self.1 * rhs.1)
}
}
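The `floor` fallback above (also used by `f32x4` when SIMD is unavailable) is round-then-correct: round to nearest, then subtract one in every lane where rounding overshot the input. A scalar sketch of the same identity, using `std`'s rounding for illustration where the vector code uses its own `round()`:

```rust
// Scalar sketch of the vector floor() fallback: round to nearest, then
// step down by one where the rounded value exceeds the input.
fn floor_via_round(x: f32) -> f32 {
    let r = x.round(); // the SIMD code uses its own round(); std's is fine here
    if r > x { r - 1.0 } else { r }
}

fn main() {
    assert_eq!(floor_via_round(1.7), 1.0);
    assert_eq!(floor_via_round(-1.2), -2.0);
    assert_eq!(floor_via_round(3.0), 3.0);
    assert_eq!(floor_via_round(-0.5), -1.0);
}
```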

// Copyright 2020 Yevhenii Reizner
//
// Use of this source code is governed by a BSD-style license that can be
// found in the LICENSE file.
// Based on https://github.com/Lokathor/wide (Zlib)
use bytemuck::cast;
#[cfg(all(not(feature = "std"), feature = "no-std-float"))]
use tiny_skia_path::NoStdFloat;
use super::i32x4;
cfg_if::cfg_if! {
if #[cfg(all(feature = "simd", target_feature = "sse2"))] {
#[cfg(target_arch = "x86")]
use core::arch::x86::*;
#[cfg(target_arch = "x86_64")]
use core::arch::x86_64::*;
#[derive(Clone, Copy, Debug)]
#[repr(C, align(16))]
pub struct f32x4(__m128);
} else if #[cfg(all(feature = "simd", target_feature = "simd128"))] {
use core::arch::wasm32::*;
// repr(transparent) allows for directly passing the v128 on the WASM stack.
#[derive(Clone, Copy, Debug)]
#[repr(transparent)]
pub struct f32x4(v128);
} else if #[cfg(all(feature = "simd", target_arch = "aarch64", target_feature = "neon"))] {
use core::arch::aarch64::*;
#[derive(Clone, Copy, Debug)]
#[repr(C, align(16))]
pub struct f32x4(float32x4_t);
} else {
use super::FasterMinMax;
#[derive(Clone, Copy, Debug)]
#[repr(C, align(16))]
pub struct f32x4([f32; 4]);
}
}
unsafe impl bytemuck::Zeroable for f32x4 {}
unsafe impl bytemuck::Pod for f32x4 {}
impl Default for f32x4 {
fn default() -> Self {
Self::splat(0.0)
}
}
impl f32x4 {
pub fn splat(n: f32) -> Self {
Self::from([n, n, n, n])
}
pub fn floor(self) -> Self {
cfg_if::cfg_if! {
if #[cfg(all(feature = "simd", target_feature = "simd128"))] {
Self(f32x4_floor(self.0))
} else if #[cfg(all(feature = "simd", target_arch = "aarch64", target_feature = "neon"))] {
Self(unsafe { vrndmq_f32(self.0) })
} else {
let roundtrip: f32x4 = cast(self.trunc_int().to_f32x4());
roundtrip - roundtrip.cmp_gt(self).blend(f32x4::splat(1.0), f32x4::default())
}
}
}
pub fn abs(self) -> Self {
cfg_if::cfg_if! {
if #[cfg(all(feature = "simd", target_feature = "simd128"))] {
Self(f32x4_abs(self.0))
} else if #[cfg(all(feature = "simd", target_arch = "aarch64", target_feature = "neon"))] {
Self(unsafe { vabsq_f32(self.0) })
} else {
let non_sign_bits = f32x4::splat(f32::from_bits(i32::MAX as u32));
self & non_sign_bits
}
}
}
pub fn max(self, rhs: Self) -> Self {
// These technically don't have the same semantics for NaN and 0, but it
// doesn't seem to matter as Skia does it the same way.
cfg_if::cfg_if! {
if #[cfg(all(feature = "simd", target_feature = "sse2"))] {
Self(unsafe { _mm_max_ps(self.0, rhs.0) })
} else if #[cfg(all(feature = "simd", target_feature = "simd128"))] {
Self(f32x4_pmax(self.0, rhs.0))
} else if #[cfg(all(feature = "simd", target_arch = "aarch64", target_feature = "neon"))] {
Self(unsafe { vmaxq_f32(self.0, rhs.0) })
} else {
Self([
self.0[0].faster_max(rhs.0[0]),
self.0[1].faster_max(rhs.0[1]),
self.0[2].faster_max(rhs.0[2]),
self.0[3].faster_max(rhs.0[3]),
])
}
}
}
pub fn min(self, rhs: Self) -> Self {
// These technically don't have the same semantics for NaN and 0, but it
// doesn't seem to matter as Skia does it the same way.
cfg_if::cfg_if! {
if #[cfg(all(feature = "simd", target_feature = "sse2"))] {
Self(unsafe { _mm_min_ps(self.0, rhs.0) })
} else if #[cfg(all(feature = "simd", target_feature = "simd128"))] {
Self(f32x4_pmin(self.0, rhs.0))
} else if #[cfg(all(feature = "simd", target_arch = "aarch64", target_feature = "neon"))] {
Self(unsafe { vminq_f32(self.0, rhs.0) })
} else {
Self([
self.0[0].faster_min(rhs.0[0]),
self.0[1].faster_min(rhs.0[1]),
self.0[2].faster_min(rhs.0[2]),
self.0[3].faster_min(rhs.0[3]),
])
}
}
}
pub fn cmp_eq(self, rhs: Self) -> Self {
cfg_if::cfg_if! {
if #[cfg(all(feature = "simd", target_feature = "sse2"))] {
Self(unsafe { _mm_cmpeq_ps(self.0, rhs.0) })
} else if #[cfg(all(feature = "simd", target_feature = "simd128"))] {
Self(f32x4_eq(self.0, rhs.0))
} else if #[cfg(all(feature = "simd", target_arch = "aarch64", target_feature = "neon"))] {
Self(cast(unsafe { vceqq_f32(self.0, rhs.0) }))
} else {
Self([
if self.0[0] == rhs.0[0] { f32::from_bits(u32::MAX) } else { 0.0 },
if self.0[1] == rhs.0[1] { f32::from_bits(u32::MAX) } else { 0.0 },
if self.0[2] == rhs.0[2] { f32::from_bits(u32::MAX) } else { 0.0 },
if self.0[3] == rhs.0[3] { f32::from_bits(u32::MAX) } else { 0.0 },
])
}
}
}
pub fn cmp_ne(self, rhs: Self) -> Self {
cfg_if::cfg_if! {
if #[cfg(all(feature = "simd", target_feature = "sse2"))] {
Self(unsafe { _mm_cmpneq_ps(self.0, rhs.0) })
} else if #[cfg(all(feature = "simd", target_feature = "simd128"))] {
Self(f32x4_ne(self.0, rhs.0))
} else if #[cfg(all(feature = "simd", target_arch = "aarch64", target_feature = "neon"))] {
Self(cast(unsafe { vmvnq_u32(vceqq_f32(self.0, rhs.0)) }))
} else {
Self([
if self.0[0] != rhs.0[0] { f32::from_bits(u32::MAX) } else { 0.0 },
if self.0[1] != rhs.0[1] { f32::from_bits(u32::MAX) } else { 0.0 },
if self.0[2] != rhs.0[2] { f32::from_bits(u32::MAX) } else { 0.0 },
if self.0[3] != rhs.0[3] { f32::from_bits(u32::MAX) } else { 0.0 },
])
}
}
}
pub fn cmp_ge(self, rhs: Self) -> Self {
cfg_if::cfg_if! {
if #[cfg(all(feature = "simd", target_feature = "sse2"))] {
Self(unsafe { _mm_cmpge_ps(self.0, rhs.0) })
} else if #[cfg(all(feature = "simd", target_feature = "simd128"))] {
Self(f32x4_ge(self.0, rhs.0))
} else if #[cfg(all(feature = "simd", target_arch = "aarch64", target_feature = "neon"))] {
Self(cast(unsafe { vcgeq_f32(self.0, rhs.0) }))
} else {
Self([
if self.0[0] >= rhs.0[0] { f32::from_bits(u32::MAX) } else { 0.0 },
if self.0[1] >= rhs.0[1] { f32::from_bits(u32::MAX) } else { 0.0 },
if self.0[2] >= rhs.0[2] { f32::from_bits(u32::MAX) } else { 0.0 },
if self.0[3] >= rhs.0[3] { f32::from_bits(u32::MAX) } else { 0.0 },
])
}
}
}
pub fn cmp_gt(self, rhs: Self) -> Self {
cfg_if::cfg_if! {
if #[cfg(all(feature = "simd", target_feature = "sse2"))] {
Self(unsafe { _mm_cmpgt_ps(self.0, rhs.0) })
} else if #[cfg(all(feature = "simd", target_feature = "simd128"))] {
Self(f32x4_gt(self.0, rhs.0))
} else if #[cfg(all(feature = "simd", target_arch = "aarch64", target_feature = "neon"))] {
Self(cast(unsafe { vcgtq_f32(self.0, rhs.0) }))
} else {
Self([
if self.0[0] > rhs.0[0] { f32::from_bits(u32::MAX) } else { 0.0 },
if self.0[1] > rhs.0[1] { f32::from_bits(u32::MAX) } else { 0.0 },
if self.0[2] > rhs.0[2] { f32::from_bits(u32::MAX) } else { 0.0 },
if self.0[3] > rhs.0[3] { f32::from_bits(u32::MAX) } else { 0.0 },
])
}
}
}
pub fn cmp_le(self, rhs: Self) -> Self {
cfg_if::cfg_if! {
if #[cfg(all(feature = "simd", target_feature = "sse2"))] {
Self(unsafe { _mm_cmple_ps(self.0, rhs.0) })
} else if #[cfg(all(feature = "simd", target_feature = "simd128"))] {
Self(f32x4_le(self.0, rhs.0))
} else if #[cfg(all(feature = "simd", target_arch = "aarch64", target_feature = "neon"))] {
Self(cast(unsafe { vcleq_f32(self.0, rhs.0) }))
} else {
Self([
if self.0[0] <= rhs.0[0] { f32::from_bits(u32::MAX) } else { 0.0 },
if self.0[1] <= rhs.0[1] { f32::from_bits(u32::MAX) } else { 0.0 },
if self.0[2] <= rhs.0[2] { f32::from_bits(u32::MAX) } else { 0.0 },
if self.0[3] <= rhs.0[3] { f32::from_bits(u32::MAX) } else { 0.0 },
])
}
}
}
pub fn cmp_lt(self, rhs: Self) -> Self {
cfg_if::cfg_if! {
if #[cfg(all(feature = "simd", target_feature = "sse2"))] {
Self(unsafe { _mm_cmplt_ps(self.0, rhs.0) })
} else if #[cfg(all(feature = "simd", target_feature = "simd128"))] {
Self(f32x4_lt(self.0, rhs.0))
} else if #[cfg(all(feature = "simd", target_arch = "aarch64", target_feature = "neon"))] {
Self(cast(unsafe { vcltq_f32(self.0, rhs.0) }))
} else {
Self([
if self.0[0] < rhs.0[0] { f32::from_bits(u32::MAX) } else { 0.0 },
if self.0[1] < rhs.0[1] { f32::from_bits(u32::MAX) } else { 0.0 },
if self.0[2] < rhs.0[2] { f32::from_bits(u32::MAX) } else { 0.0 },
if self.0[3] < rhs.0[3] { f32::from_bits(u32::MAX) } else { 0.0 },
])
}
}
}
#[inline]
pub fn blend(self, t: Self, f: Self) -> Self {
cfg_if::cfg_if! {
if #[cfg(all(feature = "simd", target_feature = "sse4.1"))] {
Self(unsafe { _mm_blendv_ps(f.0, t.0, self.0) })
} else if #[cfg(all(feature = "simd", target_feature = "simd128"))] {
Self(v128_bitselect(t.0, f.0, self.0))
} else if #[cfg(all(feature = "simd", target_arch = "aarch64", target_feature = "neon"))] {
Self(unsafe { cast(vbslq_u32( cast(self.0), cast(t.0), cast(f.0))) })
} else {
super::generic_bit_blend(self, t, f)
}
}
}
pub fn round(self) -> Self {
cfg_if::cfg_if! {
if #[cfg(all(feature = "simd", target_feature = "sse4.1"))] {
Self(
unsafe { _mm_round_ps(self.0, _MM_FROUND_NO_EXC | _MM_FROUND_TO_NEAREST_INT) },
)
} else if #[cfg(all(feature = "simd", target_feature = "simd128"))] {
Self(f32x4_nearest(self.0))
} else if #[cfg(all(feature = "simd", target_arch = "aarch64", target_feature = "neon"))] {
Self(unsafe { vrndnq_f32(self.0) })
} else {
use super::u32x4;
let to_int = f32x4::splat(1.0 / f32::EPSILON);
let u: u32x4 = cast(self);
let e: i32x4 = cast(u.shr::<23>() & u32x4::splat(0xff));
let mut y: f32x4;
let no_op_magic = i32x4::splat(0x7f + 23);
let no_op_mask: f32x4 = cast(e.cmp_gt(no_op_magic) | e.cmp_eq(no_op_magic));
let no_op_val: f32x4 = self;
let zero_magic = i32x4::splat(0x7f - 1);
let zero_mask: f32x4 = cast(e.cmp_lt(zero_magic));
let zero_val: f32x4 = self * f32x4::splat(0.0);
let neg_bit: f32x4 = cast(cast::<u32x4, i32x4>(u).cmp_lt(i32x4::default()));
let x: f32x4 = neg_bit.blend(-self, self);
y = x + to_int - to_int - x;
y = y.cmp_gt(f32x4::splat(0.5)).blend(
                    y + x - f32x4::splat(1.0),
y.cmp_lt(f32x4::splat(-0.5)).blend(y + x + f32x4::splat(1.0), y + x),
);
y = neg_bit.blend(-y, y);
no_op_mask.blend(no_op_val, zero_mask.blend(zero_val, y))
}
}
}
pub fn round_int(self) -> i32x4 {
// These technically don't have the same semantics for NaN and out of
// range values, but it doesn't seem to matter as Skia does it the same
// way.
cfg_if::cfg_if! {
if #[cfg(all(feature = "simd", target_feature = "sse2"))] {
i32x4(unsafe { _mm_cvtps_epi32(self.0) })
} else if #[cfg(all(feature = "simd", target_feature = "simd128"))] {
i32x4(i32x4_trunc_sat_f32x4(self.round().0))
} else if #[cfg(all(feature = "simd", target_arch = "aarch64", target_feature = "neon"))] {
                i32x4(unsafe { vcvtnq_s32_f32(self.0) })
} else {
let rounded: [f32; 4] = cast(self.round());
cast([
rounded[0] as i32,
rounded[1] as i32,
rounded[2] as i32,
rounded[3] as i32,
])
}
}
}
pub fn trunc_int(self) -> i32x4 {
// These technically don't have the same semantics for NaN and out of
// range values, but it doesn't seem to matter as Skia does it the same
// way.
cfg_if::cfg_if! {
if #[cfg(all(feature = "simd", target_feature = "sse2"))] {
i32x4(unsafe { _mm_cvttps_epi32(self.0) })
} else if #[cfg(all(feature = "simd", target_feature = "simd128"))] {
i32x4(i32x4_trunc_sat_f32x4(self.0))
} else if #[cfg(all(feature = "simd", target_arch = "aarch64", target_feature = "neon"))] {
i32x4(unsafe { vcvtq_s32_f32(self.0) })
} else {
cast([
self.0[0] as i32,
self.0[1] as i32,
self.0[2] as i32,
self.0[3] as i32,
])
}
}
}
pub fn recip_fast(self) -> Self {
cfg_if::cfg_if! {
if #[cfg(all(feature = "simd", target_feature = "sse2"))] {
Self(unsafe { _mm_rcp_ps(self.0) })
} else if #[cfg(all(feature = "simd", target_feature = "simd128"))] {
Self(f32x4_div(f32x4_splat(1.0), self.0))
} else if #[cfg(all(feature = "simd", target_arch = "aarch64", target_feature = "neon"))] {
unsafe {
let a = vrecpeq_f32(self.0);
let a = vmulq_f32(vrecpsq_f32(self.0, a), a);
Self(a)
}
} else {
Self::from([
1.0 / self.0[0],
1.0 / self.0[1],
1.0 / self.0[2],
1.0 / self.0[3],
])
}
}
}
pub fn recip_sqrt(self) -> Self {
cfg_if::cfg_if! {
if #[cfg(all(feature = "simd", target_feature = "sse2"))] {
Self(unsafe { _mm_rsqrt_ps(self.0) })
} else if #[cfg(all(feature = "simd", target_feature = "simd128"))] {
Self(f32x4_div(f32x4_splat(1.0), f32x4_sqrt(self.0)))
} else if #[cfg(all(feature = "simd", target_arch = "aarch64", target_feature = "neon"))] {
unsafe {
let a = vrsqrteq_f32(self.0);
let a = vmulq_f32(vrsqrtsq_f32(self.0, vmulq_f32(a, a)), a);
Self(a)
}
} else {
Self::from([
1.0 / self.0[0].sqrt(),
1.0 / self.0[1].sqrt(),
1.0 / self.0[2].sqrt(),
1.0 / self.0[3].sqrt(),
])
}
}
}
pub fn sqrt(self) -> Self {
cfg_if::cfg_if! {
if #[cfg(all(feature = "simd", target_feature = "sse2"))] {
Self(unsafe { _mm_sqrt_ps(self.0) })
} else if #[cfg(all(feature = "simd", target_feature = "simd128"))] {
Self(f32x4_sqrt(self.0))
} else if #[cfg(all(feature = "simd", target_arch = "aarch64", target_feature = "neon"))] {
Self(unsafe { vsqrtq_f32(self.0) })
} else {
Self::from([
self.0[0].sqrt(),
self.0[1].sqrt(),
self.0[2].sqrt(),
self.0[3].sqrt(),
])
}
}
}
}
impl From<[f32; 4]> for f32x4 {
fn from(v: [f32; 4]) -> Self {
cast(v)
}
}
impl From<f32x4> for [f32; 4] {
fn from(v: f32x4) -> Self {
cast(v)
}
}
impl core::ops::Add for f32x4 {
type Output = Self;
fn add(self, rhs: Self) -> Self::Output {
cfg_if::cfg_if! {
if #[cfg(all(feature = "simd", target_feature = "sse2"))] {
Self(unsafe { _mm_add_ps(self.0, rhs.0) })
} else if #[cfg(all(feature = "simd", target_feature = "simd128"))] {
Self(f32x4_add(self.0, rhs.0))
} else if #[cfg(all(feature = "simd", target_arch = "aarch64", target_feature = "neon"))] {
Self(unsafe { vaddq_f32(self.0, rhs.0) })
} else {
Self([
self.0[0] + rhs.0[0],
self.0[1] + rhs.0[1],
self.0[2] + rhs.0[2],
self.0[3] + rhs.0[3],
])
}
}
}
}
impl core::ops::AddAssign for f32x4 {
fn add_assign(&mut self, rhs: f32x4) {
*self = *self + rhs;
}
}
impl core::ops::Sub for f32x4 {
type Output = Self;
fn sub(self, rhs: Self) -> Self::Output {
cfg_if::cfg_if! {
if #[cfg(all(feature = "simd", target_feature = "sse2"))] {
Self(unsafe { _mm_sub_ps(self.0, rhs.0) })
} else if #[cfg(all(feature = "simd", target_feature = "simd128"))] {
Self(f32x4_sub(self.0, rhs.0))
} else if #[cfg(all(feature = "simd", target_arch = "aarch64", target_feature = "neon"))] {
Self(unsafe { vsubq_f32(self.0, rhs.0) })
} else {
Self([
self.0[0] - rhs.0[0],
self.0[1] - rhs.0[1],
self.0[2] - rhs.0[2],
self.0[3] - rhs.0[3],
])
}
}
}
}
impl core::ops::Mul for f32x4 {
type Output = Self;
fn mul(self, rhs: Self) -> Self::Output {
cfg_if::cfg_if! {
if #[cfg(all(feature = "simd", target_feature = "sse2"))] {
Self(unsafe { _mm_mul_ps(self.0, rhs.0) })
} else if #[cfg(all(feature = "simd", target_feature = "simd128"))] {
Self(f32x4_mul(self.0, rhs.0))
} else if #[cfg(all(feature = "simd", target_arch = "aarch64", target_feature = "neon"))] {
Self(unsafe { vmulq_f32(self.0, rhs.0) })
} else {
Self([
self.0[0] * rhs.0[0],
self.0[1] * rhs.0[1],
self.0[2] * rhs.0[2],
self.0[3] * rhs.0[3],
])
}
}
}
}
impl core::ops::MulAssign for f32x4 {
fn mul_assign(&mut self, rhs: f32x4) {
*self = *self * rhs;
}
}
impl core::ops::Div for f32x4 {
type Output = Self;
fn div(self, rhs: Self) -> Self::Output {
cfg_if::cfg_if! {
if #[cfg(all(feature = "simd", target_feature = "sse2"))] {
Self(unsafe { _mm_div_ps(self.0, rhs.0) })
} else if #[cfg(all(feature = "simd", target_feature = "simd128"))] {
Self(f32x4_div(self.0, rhs.0))
} else if #[cfg(all(feature = "simd", target_arch = "aarch64", target_feature = "neon"))] {
Self(unsafe { vdivq_f32(self.0, rhs.0) })
} else {
Self([
self.0[0] / rhs.0[0],
self.0[1] / rhs.0[1],
self.0[2] / rhs.0[2],
self.0[3] / rhs.0[3],
])
}
}
}
}
impl core::ops::BitAnd for f32x4 {
type Output = Self;
#[inline(always)]
fn bitand(self, rhs: Self) -> Self::Output {
cfg_if::cfg_if! {
if #[cfg(all(feature = "simd", target_feature = "sse2"))] {
Self(unsafe { _mm_and_ps(self.0, rhs.0) })
} else if #[cfg(all(feature = "simd", target_feature = "simd128"))] {
Self(v128_and(self.0, rhs.0))
} else if #[cfg(all(feature = "simd", target_arch = "aarch64", target_feature = "neon"))] {
Self(cast(unsafe { vandq_u32(cast(self.0), cast(rhs.0)) }))
} else {
Self([
f32::from_bits(self.0[0].to_bits() & rhs.0[0].to_bits()),
f32::from_bits(self.0[1].to_bits() & rhs.0[1].to_bits()),
f32::from_bits(self.0[2].to_bits() & rhs.0[2].to_bits()),
f32::from_bits(self.0[3].to_bits() & rhs.0[3].to_bits()),
])
}
}
}
}
impl core::ops::BitOr for f32x4 {
type Output = Self;
#[inline(always)]
fn bitor(self, rhs: Self) -> Self::Output {
cfg_if::cfg_if! {
if #[cfg(all(feature = "simd", target_feature = "sse2"))] {
Self(unsafe { _mm_or_ps(self.0, rhs.0) })
} else if #[cfg(all(feature = "simd", target_feature = "simd128"))] {
Self(v128_or(self.0, rhs.0))
} else if #[cfg(all(feature = "simd", target_arch = "aarch64", target_feature = "neon"))] {
Self(cast(unsafe { vorrq_u32(cast(self.0), cast(rhs.0)) }))
} else {
Self([
f32::from_bits(self.0[0].to_bits() | rhs.0[0].to_bits()),
f32::from_bits(self.0[1].to_bits() | rhs.0[1].to_bits()),
f32::from_bits(self.0[2].to_bits() | rhs.0[2].to_bits()),
f32::from_bits(self.0[3].to_bits() | rhs.0[3].to_bits()),
])
}
}
}
}
impl core::ops::BitXor for f32x4 {
type Output = Self;
#[inline(always)]
fn bitxor(self, rhs: Self) -> Self::Output {
cfg_if::cfg_if! {
if #[cfg(all(feature = "simd", target_feature = "sse2"))] {
Self(unsafe { _mm_xor_ps(self.0, rhs.0) })
} else if #[cfg(all(feature = "simd", target_feature = "simd128"))] {
Self(v128_xor(self.0, rhs.0))
} else if #[cfg(all(feature = "simd", target_arch = "aarch64", target_feature = "neon"))] {
Self(cast(unsafe { veorq_u32(cast(self.0), cast(rhs.0)) }))
} else {
Self([
f32::from_bits(self.0[0].to_bits() ^ rhs.0[0].to_bits()),
f32::from_bits(self.0[1].to_bits() ^ rhs.0[1].to_bits()),
f32::from_bits(self.0[2].to_bits() ^ rhs.0[2].to_bits()),
f32::from_bits(self.0[3].to_bits() ^ rhs.0[3].to_bits()),
])
}
}
}
}
impl core::ops::Neg for f32x4 {
type Output = Self;
fn neg(self) -> Self {
Self::default() - self
}
}
impl core::ops::Not for f32x4 {
type Output = Self;
fn not(self) -> Self {
cfg_if::cfg_if! {
if #[cfg(all(feature = "simd", target_feature = "sse2"))] {
unsafe {
let all_bits = _mm_set1_ps(f32::from_bits(u32::MAX));
Self(_mm_xor_ps(self.0, all_bits))
}
} else if #[cfg(all(feature = "simd", target_feature = "simd128"))] {
Self(v128_not(self.0))
} else if #[cfg(all(feature = "simd", target_arch = "aarch64", target_feature = "neon"))] {
Self(cast(unsafe { vmvnq_u32(cast(self.0)) }))
} else {
self ^ Self::splat(cast(u32::MAX))
}
}
}
}
impl core::cmp::PartialEq for f32x4 {
fn eq(&self, rhs: &Self) -> bool {
cfg_if::cfg_if! {
if #[cfg(all(feature = "simd", target_feature = "sse2"))] {
unsafe { _mm_movemask_ps(_mm_cmpeq_ps(self.0, rhs.0)) == 0b1111 }
} else if #[cfg(all(feature = "simd", target_arch = "aarch64", target_feature = "neon"))] {
unsafe { vminvq_u32(vceqq_f32(self.0, rhs.0)) != 0 }
} else if #[cfg(all(feature = "simd", target_feature = "simd128"))] {
u32x4_all_true(f32x4_eq(self.0, rhs.0))
} else {
self.0 == rhs.0
}
}
}
}
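The comparison methods above return lane masks that are either all ones or all zeros, which is what makes `blend` a pure bit-select. A scalar sketch of the fallback select (assumed to match what `super::generic_bit_blend` does; not the crate's API):

```rust
// Scalar version of the blend() fallback: the mask is all ones or all
// zeros, so (t & mask) | (f & !mask) picks t where the mask is set.
fn bit_blend(mask: u32, t: f32, f: f32) -> f32 {
    f32::from_bits((t.to_bits() & mask) | (f.to_bits() & !mask))
}

fn main() {
    assert_eq!(bit_blend(u32::MAX, 1.0, 2.0), 1.0); // mask set: take t
    assert_eq!(bit_blend(0, 1.0, 2.0), 2.0);        // mask clear: take f
}
```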

// Copyright 2020 Yevhenii Reizner
//
// Use of this source code is governed by a BSD-style license that can be
// found in the LICENSE file.
// Based on https://github.com/Lokathor/wide (Zlib)
use bytemuck::cast;
use super::{i32x8, u32x8};
cfg_if::cfg_if! {
if #[cfg(all(feature = "simd", target_feature = "avx"))] {
#[cfg(target_arch = "x86")]
use core::arch::x86::*;
#[cfg(target_arch = "x86_64")]
use core::arch::x86_64::*;
#[derive(Clone, Copy, Debug)]
#[repr(C, align(32))]
pub struct f32x8(__m256);
} else {
use super::f32x4;
#[derive(Clone, Copy, Debug)]
#[repr(C, align(32))]
pub struct f32x8(pub f32x4, pub f32x4);
}
}
unsafe impl bytemuck::Zeroable for f32x8 {}
unsafe impl bytemuck::Pod for f32x8 {}
impl Default for f32x8 {
fn default() -> Self {
Self::splat(0.0)
}
}
impl f32x8 {
pub fn splat(n: f32) -> Self {
cast([n, n, n, n, n, n, n, n])
}
pub fn floor(self) -> Self {
let roundtrip: f32x8 = cast(self.trunc_int().to_f32x8());
roundtrip
- roundtrip
.cmp_gt(self)
.blend(f32x8::splat(1.0), f32x8::default())
}
pub fn fract(self) -> Self {
self - self.floor()
}
pub fn normalize(self) -> Self {
self.max(f32x8::default()).min(f32x8::splat(1.0))
}
pub fn to_i32x8_bitcast(self) -> i32x8 {
bytemuck::cast(self)
}
pub fn to_u32x8_bitcast(self) -> u32x8 {
bytemuck::cast(self)
}
pub fn cmp_eq(self, rhs: Self) -> Self {
cfg_if::cfg_if! {
if #[cfg(all(feature = "simd", target_feature = "avx"))] {
Self(unsafe { _mm256_cmp_ps(self.0, rhs.0, _CMP_EQ_OQ) })
} else {
Self(self.0.cmp_eq(rhs.0), self.1.cmp_eq(rhs.1))
}
}
}
pub fn cmp_ne(self, rhs: Self) -> Self {
cfg_if::cfg_if! {
if #[cfg(all(feature = "simd", target_feature = "avx"))] {
Self(unsafe { _mm256_cmp_ps(self.0, rhs.0, _CMP_NEQ_OQ) })
} else {
Self(self.0.cmp_ne(rhs.0), self.1.cmp_ne(rhs.1))
}
}
}
pub fn cmp_ge(self, rhs: Self) -> Self {
cfg_if::cfg_if! {
if #[cfg(all(feature = "simd", target_feature = "avx"))] {
Self(unsafe { _mm256_cmp_ps(self.0, rhs.0, _CMP_GE_OQ) })
} else {
Self(self.0.cmp_ge(rhs.0), self.1.cmp_ge(rhs.1))
}
}
}
pub fn cmp_gt(self, rhs: Self) -> Self {
cfg_if::cfg_if! {
if #[cfg(all(feature = "simd", target_feature = "avx"))] {
Self(unsafe { _mm256_cmp_ps(self.0, rhs.0, _CMP_GT_OQ) })
} else {
Self(self.0.cmp_gt(rhs.0), self.1.cmp_gt(rhs.1))
}
}
}
pub fn cmp_le(self, rhs: Self) -> Self {
cfg_if::cfg_if! {
if #[cfg(all(feature = "simd", target_feature = "avx"))] {
Self(unsafe { _mm256_cmp_ps(self.0, rhs.0, _CMP_LE_OQ) })
} else {
Self(self.0.cmp_le(rhs.0), self.1.cmp_le(rhs.1))
}
}
}
pub fn cmp_lt(self, rhs: Self) -> Self {
cfg_if::cfg_if! {
if #[cfg(all(feature = "simd", target_feature = "avx"))] {
Self(unsafe { _mm256_cmp_ps(self.0, rhs.0, _CMP_LT_OQ) })
} else {
Self(self.0.cmp_lt(rhs.0), self.1.cmp_lt(rhs.1))
}
}
}
#[inline]
pub fn blend(self, t: Self, f: Self) -> Self {
cfg_if::cfg_if! {
if #[cfg(all(feature = "simd", target_feature = "avx"))] {
Self(unsafe { _mm256_blendv_ps(f.0, t.0, self.0) })
} else {
Self(self.0.blend(t.0, f.0), self.1.blend(t.1, f.1))
}
}
}
pub fn abs(self) -> Self {
let non_sign_bits = f32x8::splat(f32::from_bits(i32::MAX as u32));
self & non_sign_bits
}
pub fn max(self, rhs: Self) -> Self {
// These technically don't have the same semantics for NaN and 0, but it
// doesn't seem to matter as Skia does it the same way.
cfg_if::cfg_if! {
if #[cfg(all(feature = "simd", target_feature = "avx"))] {
Self(unsafe { _mm256_max_ps(self.0, rhs.0) })
} else {
Self(self.0.max(rhs.0), self.1.max(rhs.1))
}
}
}
pub fn min(self, rhs: Self) -> Self {
// These technically don't have the same semantics for NaN and 0, but it
// doesn't seem to matter as Skia does it the same way.
cfg_if::cfg_if! {
if #[cfg(all(feature = "simd", target_feature = "avx"))] {
Self(unsafe { _mm256_min_ps(self.0, rhs.0) })
} else {
Self(self.0.min(rhs.0), self.1.min(rhs.1))
}
}
}
pub fn is_finite(self) -> Self {
let shifted_exp_mask = u32x8::splat(0xFF000000);
let u: u32x8 = cast(self);
let shift_u = u.shl::<1>();
let out = !(shift_u & shifted_exp_mask).cmp_eq(shifted_exp_mask);
cast(out)
}
pub fn round(self) -> Self {
cfg_if::cfg_if! {
if #[cfg(all(feature = "simd", target_feature = "avx"))] {
Self(unsafe { _mm256_round_ps(self.0, _MM_FROUND_NO_EXC | _MM_FROUND_TO_NEAREST_INT) })
} else {
Self(self.0.round(), self.1.round())
}
}
}
pub fn round_int(self) -> i32x8 {
// These technically don't have the same semantics for NaN and out of
// range values, but it doesn't seem to matter as Skia does it the same
// way.
cfg_if::cfg_if! {
if #[cfg(all(feature = "simd", target_feature = "avx"))] {
cast(unsafe { _mm256_cvtps_epi32(self.0) })
} else {
i32x8(self.0.round_int(), self.1.round_int())
}
}
}
pub fn trunc_int(self) -> i32x8 {
// These technically don't have the same semantics for NaN and out of
// range values, but it doesn't seem to matter as Skia does it the same
// way.
cfg_if::cfg_if! {
if #[cfg(all(feature = "simd", target_feature = "avx"))] {
cast(unsafe { _mm256_cvttps_epi32(self.0) })
} else {
i32x8(self.0.trunc_int(), self.1.trunc_int())
}
}
}
pub fn recip_fast(self) -> Self {
cfg_if::cfg_if! {
if #[cfg(all(feature = "simd", target_feature = "avx"))] {
Self(unsafe { _mm256_rcp_ps(self.0) })
} else {
Self(self.0.recip_fast(), self.1.recip_fast())
}
}
}
pub fn recip_sqrt(self) -> Self {
cfg_if::cfg_if! {
if #[cfg(all(feature = "simd", target_feature = "avx"))] {
Self(unsafe { _mm256_rsqrt_ps(self.0) })
} else {
Self(self.0.recip_sqrt(), self.1.recip_sqrt())
}
}
}
pub fn sqrt(self) -> Self {
cfg_if::cfg_if! {
if #[cfg(all(feature = "simd", target_feature = "avx"))] {
Self(unsafe { _mm256_sqrt_ps(self.0) })
} else {
Self(self.0.sqrt(), self.1.sqrt())
}
}
}
}
impl From<[f32; 8]> for f32x8 {
fn from(v: [f32; 8]) -> Self {
cast(v)
}
}
impl From<f32x8> for [f32; 8] {
fn from(v: f32x8) -> Self {
cast(v)
}
}
impl core::ops::Add for f32x8 {
type Output = Self;
fn add(self, rhs: Self) -> Self::Output {
cfg_if::cfg_if! {
if #[cfg(all(feature = "simd", target_feature = "avx"))] {
Self(unsafe { _mm256_add_ps(self.0, rhs.0) })
} else {
Self(self.0 + rhs.0, self.1 + rhs.1)
}
}
}
}
impl core::ops::AddAssign for f32x8 {
fn add_assign(&mut self, rhs: f32x8) {
*self = *self + rhs;
}
}
impl core::ops::Sub for f32x8 {
type Output = Self;
fn sub(self, rhs: Self) -> Self::Output {
cfg_if::cfg_if! {
if #[cfg(all(feature = "simd", target_feature = "avx"))] {
Self(unsafe { _mm256_sub_ps(self.0, rhs.0) })
} else {
Self(self.0 - rhs.0, self.1 - rhs.1)
}
}
}
}
impl core::ops::Mul for f32x8 {
type Output = Self;
fn mul(self, rhs: Self) -> Self::Output {
cfg_if::cfg_if! {
if #[cfg(all(feature = "simd", target_feature = "avx"))] {
Self(unsafe { _mm256_mul_ps(self.0, rhs.0) })
} else {
Self(self.0 * rhs.0, self.1 * rhs.1)
}
}
}
}
impl core::ops::MulAssign for f32x8 {
fn mul_assign(&mut self, rhs: f32x8) {
*self = *self * rhs;
}
}
impl core::ops::Div for f32x8 {
type Output = Self;
fn div(self, rhs: Self) -> Self::Output {
cfg_if::cfg_if! {
if #[cfg(all(feature = "simd", target_feature = "avx"))] {
Self(unsafe { _mm256_div_ps(self.0, rhs.0) })
} else {
Self(self.0 / rhs.0, self.1 / rhs.1)
}
}
}
}
impl core::ops::BitAnd for f32x8 {
type Output = Self;
#[inline(always)]
fn bitand(self, rhs: Self) -> Self::Output {
cfg_if::cfg_if! {
if #[cfg(all(feature = "simd", target_feature = "avx"))] {
Self(unsafe { _mm256_and_ps(self.0, rhs.0) })
} else {
Self(self.0 & rhs.0, self.1 & rhs.1)
}
}
}
}
impl core::ops::BitOr for f32x8 {
type Output = Self;
#[inline(always)]
fn bitor(self, rhs: Self) -> Self::Output {
cfg_if::cfg_if! {
if #[cfg(all(feature = "simd", target_feature = "avx"))] {
Self(unsafe { _mm256_or_ps(self.0, rhs.0) })
} else {
Self(self.0 | rhs.0, self.1 | rhs.1)
}
}
}
}
impl core::ops::BitXor for f32x8 {
type Output = Self;
#[inline(always)]
fn bitxor(self, rhs: Self) -> Self::Output {
cfg_if::cfg_if! {
if #[cfg(all(feature = "simd", target_feature = "avx"))] {
Self(unsafe { _mm256_xor_ps(self.0, rhs.0) })
} else {
Self(self.0 ^ rhs.0, self.1 ^ rhs.1)
}
}
}
}
impl core::ops::Neg for f32x8 {
type Output = Self;
fn neg(self) -> Self {
Self::default() - self
}
}
impl core::ops::Not for f32x8 {
type Output = Self;
fn not(self) -> Self {
cfg_if::cfg_if! {
if #[cfg(all(feature = "simd", target_feature = "avx"))] {
let all_bits = unsafe { _mm256_set1_ps(f32::from_bits(u32::MAX)) };
Self(unsafe { _mm256_xor_ps(self.0, all_bits) })
} else {
Self(!self.0, !self.1)
}
}
}
}
impl core::cmp::PartialEq for f32x8 {
fn eq(&self, rhs: &Self) -> bool {
cfg_if::cfg_if! {
if #[cfg(all(feature = "simd", target_feature = "avx"))] {
let mask = unsafe { _mm256_cmp_ps(self.0, rhs.0, _CMP_EQ_OQ) };
unsafe { _mm256_movemask_ps(mask) == 0b1111_1111 }
} else {
self.0 == rhs.0 && self.1 == rhs.1
}
}
}
}
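The `Neg` and `Not` implementations above rely on two scalar tricks: negation as subtraction from a zeroed vector, and bitwise inversion as XOR with an all-ones pattern. A minimal standalone sketch on plain `f32` (not the `f32x8` type itself):

```rust
fn main() {
    // Neg is `Self::default() - self`, i.e. 0.0 - x per lane.
    assert_eq!(0.0f32 - 3.5, -3.5);
    // Caveat of this formulation: 0.0 - 0.0 is +0.0, not -0.0.
    assert!((0.0f32 - 0.0).is_sign_positive());

    // Not is XOR with an all-ones bit pattern. For floats this is only
    // meaningful on lane masks (all-zeros / all-ones comparison results).
    let mask = f32::from_bits(0x0000_FFFF);
    let inverted = f32::from_bits(mask.to_bits() ^ u32::MAX);
    assert_eq!(inverted.to_bits(), 0xFFFF_0000);
    println!("ok");
}
```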

// Copyright 2020 Yevhenii Reizner
//
// Use of this source code is governed by a BSD-style license that can be
// found in the LICENSE file.
// Based on https://github.com/Lokathor/wide (Zlib)
use bytemuck::cast;
use super::f32x4;
cfg_if::cfg_if! {
if #[cfg(all(feature = "simd", target_feature = "sse2"))] {
#[cfg(target_arch = "x86")]
use core::arch::x86::*;
#[cfg(target_arch = "x86_64")]
use core::arch::x86_64::*;
#[derive(Clone, Copy, Debug)]
#[repr(C, align(16))]
pub struct i32x4(pub __m128i);
} else if #[cfg(all(feature = "simd", target_feature = "simd128"))] {
use core::arch::wasm32::*;
#[derive(Clone, Copy, Debug)]
#[repr(C, align(16))]
pub struct i32x4(pub v128);
} else if #[cfg(all(feature = "simd", target_arch = "aarch64", target_feature = "neon"))] {
use core::arch::aarch64::*;
#[derive(Clone, Copy, Debug)]
#[repr(C, align(16))]
pub struct i32x4(pub int32x4_t);
} else {
#[derive(Clone, Copy, Debug)]
#[repr(C, align(16))]
pub struct i32x4([i32; 4]);
}
}
unsafe impl bytemuck::Zeroable for i32x4 {}
unsafe impl bytemuck::Pod for i32x4 {}
impl Default for i32x4 {
fn default() -> Self {
Self::splat(0)
}
}
impl i32x4 {
pub fn splat(n: i32) -> Self {
cast([n, n, n, n])
}
pub fn blend(self, t: Self, f: Self) -> Self {
cfg_if::cfg_if! {
if #[cfg(all(feature = "simd", target_feature = "sse4.1"))] {
Self(unsafe { _mm_blendv_epi8(f.0, t.0, self.0) })
} else if #[cfg(all(feature = "simd", target_feature = "simd128"))] {
Self(v128_bitselect(t.0, f.0, self.0))
} else if #[cfg(all(feature = "simd", target_arch = "aarch64", target_feature = "neon"))] {
Self(unsafe { vbslq_s32(cast(self.0), t.0, f.0) })
} else {
super::generic_bit_blend(self, t, f)
}
}
}
pub fn cmp_eq(self, rhs: Self) -> Self {
cfg_if::cfg_if! {
if #[cfg(all(feature = "simd", target_feature = "sse2"))] {
cast(Self(cast(unsafe { _mm_cmpeq_epi32(self.0, rhs.0) })))
} else if #[cfg(all(feature = "simd", target_feature = "simd128"))] {
Self(i32x4_eq(self.0, rhs.0))
} else if #[cfg(all(feature = "simd", target_arch = "aarch64", target_feature = "neon"))] {
Self(unsafe { cast(vceqq_s32(self.0, rhs.0)) })
} else {
Self([
if self.0[0] == rhs.0[0] { -1 } else { 0 },
if self.0[1] == rhs.0[1] { -1 } else { 0 },
if self.0[2] == rhs.0[2] { -1 } else { 0 },
if self.0[3] == rhs.0[3] { -1 } else { 0 },
])
}
}
}
pub fn cmp_gt(self, rhs: Self) -> Self {
cfg_if::cfg_if! {
if #[cfg(all(feature = "simd", target_feature = "sse2"))] {
cast(Self(cast(unsafe { _mm_cmpgt_epi32(self.0, rhs.0) })))
} else if #[cfg(all(feature = "simd", target_feature = "simd128"))] {
Self(i32x4_gt(self.0, rhs.0))
} else if #[cfg(all(feature = "simd", target_arch = "aarch64", target_feature = "neon"))] {
Self(unsafe { cast(vcgtq_s32(self.0, rhs.0)) })
} else {
Self([
if self.0[0] > rhs.0[0] { -1 } else { 0 },
if self.0[1] > rhs.0[1] { -1 } else { 0 },
if self.0[2] > rhs.0[2] { -1 } else { 0 },
if self.0[3] > rhs.0[3] { -1 } else { 0 },
])
}
}
}
pub fn cmp_lt(self, rhs: Self) -> Self {
cfg_if::cfg_if! {
if #[cfg(all(feature = "simd", target_feature = "sse2"))] {
cast(Self(cast(unsafe { _mm_cmplt_epi32(self.0, rhs.0) })))
} else if #[cfg(all(feature = "simd", target_feature = "simd128"))] {
Self(i32x4_lt(self.0, rhs.0))
} else if #[cfg(all(feature = "simd", target_arch = "aarch64", target_feature = "neon"))] {
Self(unsafe { cast(vcltq_s32(self.0, rhs.0)) })
} else {
Self([
if self.0[0] < rhs.0[0] { -1 } else { 0 },
if self.0[1] < rhs.0[1] { -1 } else { 0 },
if self.0[2] < rhs.0[2] { -1 } else { 0 },
if self.0[3] < rhs.0[3] { -1 } else { 0 },
])
}
}
}
pub fn to_f32x4(self) -> f32x4 {
cfg_if::cfg_if! {
if #[cfg(all(feature = "simd", target_feature = "sse2"))] {
cast(Self(cast(unsafe { _mm_cvtepi32_ps(self.0) })))
} else if #[cfg(all(feature = "simd", target_feature = "simd128"))] {
cast(Self(f32x4_convert_i32x4(self.0)))
} else if #[cfg(all(feature = "simd", target_arch = "aarch64", target_feature = "neon"))] {
cast(Self(unsafe { cast(vcvtq_f32_s32(self.0)) }))
} else {
let arr: [i32; 4] = cast(self);
cast([
arr[0] as f32,
arr[1] as f32,
arr[2] as f32,
arr[3] as f32,
])
}
}
}
pub fn to_f32x4_bitcast(self) -> f32x4 {
bytemuck::cast(self)
}
}
impl From<[i32; 4]> for i32x4 {
fn from(v: [i32; 4]) -> Self {
cast(v)
}
}
impl From<i32x4> for [i32; 4] {
fn from(v: i32x4) -> Self {
cast(v)
}
}
impl core::ops::Add for i32x4 {
type Output = Self;
fn add(self, rhs: Self) -> Self::Output {
cfg_if::cfg_if! {
if #[cfg(all(feature = "simd", target_feature = "sse2"))] {
Self(unsafe { _mm_add_epi32(self.0, rhs.0) })
} else if #[cfg(all(feature = "simd", target_feature = "simd128"))] {
Self(i32x4_add(self.0, rhs.0))
} else if #[cfg(all(feature = "simd", target_arch = "aarch64", target_feature = "neon"))] {
Self(unsafe { vaddq_s32(self.0, rhs.0) })
} else {
Self([
self.0[0].wrapping_add(rhs.0[0]),
self.0[1].wrapping_add(rhs.0[1]),
self.0[2].wrapping_add(rhs.0[2]),
self.0[3].wrapping_add(rhs.0[3]),
])
}
}
}
}
impl core::ops::BitAnd for i32x4 {
type Output = Self;
fn bitand(self, rhs: Self) -> Self::Output {
cfg_if::cfg_if! {
if #[cfg(all(feature = "simd", target_feature = "sse2"))] {
Self(unsafe { _mm_and_si128(self.0, rhs.0) })
} else if #[cfg(all(feature = "simd", target_feature = "simd128"))] {
Self(v128_and(self.0, rhs.0))
} else if #[cfg(all(feature = "simd", target_arch = "aarch64", target_feature = "neon"))] {
Self(unsafe { vandq_s32(self.0, rhs.0) })
} else {
Self([
self.0[0] & rhs.0[0],
self.0[1] & rhs.0[1],
self.0[2] & rhs.0[2],
self.0[3] & rhs.0[3],
])
}
}
}
}
impl core::ops::Mul for i32x4 {
type Output = Self;
fn mul(self, rhs: Self) -> Self::Output {
cfg_if::cfg_if! {
if #[cfg(all(feature = "simd", target_feature = "sse4.1"))] {
Self(unsafe { _mm_mullo_epi32(self.0, rhs.0) })
} else if #[cfg(all(feature = "simd", target_feature = "simd128"))] {
Self(i32x4_mul(self.0, rhs.0))
} else if #[cfg(all(feature = "simd", target_arch = "aarch64", target_feature = "neon"))] {
Self(unsafe { vmulq_s32(self.0, rhs.0) })
} else {
// Cast is required, since we have to use scalar multiplication on SSE2.
let a: [i32; 4] = cast(self);
let b: [i32; 4] = cast(rhs);
Self(cast([
a[0].wrapping_mul(b[0]),
a[1].wrapping_mul(b[1]),
a[2].wrapping_mul(b[2]),
a[3].wrapping_mul(b[3]),
]))
}
}
}
}
impl core::ops::BitOr for i32x4 {
type Output = Self;
#[inline]
fn bitor(self, rhs: Self) -> Self::Output {
cfg_if::cfg_if! {
if #[cfg(all(feature = "simd", target_feature = "sse2"))] {
Self(unsafe { _mm_or_si128(self.0, rhs.0) })
} else if #[cfg(all(feature = "simd", target_feature = "simd128"))] {
Self(v128_or(self.0, rhs.0))
} else if #[cfg(all(feature = "simd", target_arch = "aarch64", target_feature = "neon"))] {
Self(unsafe { vorrq_s32(self.0, rhs.0) })
} else {
Self([
self.0[0] | rhs.0[0],
self.0[1] | rhs.0[1],
self.0[2] | rhs.0[2],
self.0[3] | rhs.0[3],
])
}
}
}
}
impl core::ops::BitXor for i32x4 {
type Output = Self;
#[inline]
fn bitxor(self, rhs: Self) -> Self::Output {
cfg_if::cfg_if! {
if #[cfg(all(feature = "simd", target_feature = "sse2"))] {
Self(unsafe { _mm_xor_si128(self.0, rhs.0) })
} else if #[cfg(all(feature = "simd", target_feature = "simd128"))] {
Self(v128_xor(self.0, rhs.0))
} else if #[cfg(all(feature = "simd", target_arch = "aarch64", target_feature = "neon"))] {
Self(unsafe { veorq_s32(self.0, rhs.0) })
} else {
Self([
self.0[0] ^ rhs.0[0],
self.0[1] ^ rhs.0[1],
self.0[2] ^ rhs.0[2],
self.0[3] ^ rhs.0[3],
])
}
}
}
}
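On the fallback path, comparisons produce per-lane masks of all ones (`-1`) or all zeros, which `blend` then uses to select bits lane by lane. A standalone sketch of that technique on plain arrays (the helper names here are illustrative, not the crate's API):

```rust
fn cmp_lt(a: [i32; 4], b: [i32; 4]) -> [i32; 4] {
    // All-ones lane (-1) where the comparison holds, all-zeros otherwise.
    [
        if a[0] < b[0] { -1 } else { 0 },
        if a[1] < b[1] { -1 } else { 0 },
        if a[2] < b[2] { -1 } else { 0 },
        if a[3] < b[3] { -1 } else { 0 },
    ]
}

fn blend(mask: [i32; 4], t: [i32; 4], f: [i32; 4]) -> [i32; 4] {
    // Same identity as `generic_bit_blend`: n ^ ((n ^ y) & mask).
    let mut out = [0; 4];
    for i in 0..4 {
        out[i] = f[i] ^ ((f[i] ^ t[i]) & mask[i]);
    }
    out
}

fn main() {
    let a = [1, 5, 3, 7];
    let b = [4, 2, 3, 9];
    let mask = cmp_lt(a, b);
    assert_eq!(mask, [-1, 0, 0, -1]);
    // Selecting `a` where the mask is set yields the per-lane minimum.
    assert_eq!(blend(mask, a, b), [1, 2, 3, 7]);
    println!("ok");
}
```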

// Copyright 2020 Yevhenii Reizner
//
// Use of this source code is governed by a BSD-style license that can be
// found in the LICENSE file.
// Based on https://github.com/Lokathor/wide (Zlib)
use bytemuck::cast;
use super::{f32x8, u32x8};
cfg_if::cfg_if! {
if #[cfg(all(feature = "simd", target_feature = "avx2"))] {
#[cfg(target_arch = "x86")]
use core::arch::x86::*;
#[cfg(target_arch = "x86_64")]
use core::arch::x86_64::*;
#[derive(Clone, Copy, Debug)]
#[repr(C, align(32))]
pub struct i32x8(__m256i);
} else {
use super::i32x4;
#[derive(Clone, Copy, Debug)]
#[repr(C, align(32))]
pub struct i32x8(pub i32x4, pub i32x4);
}
}
unsafe impl bytemuck::Zeroable for i32x8 {}
unsafe impl bytemuck::Pod for i32x8 {}
impl Default for i32x8 {
fn default() -> Self {
Self::splat(0)
}
}
impl i32x8 {
pub fn splat(n: i32) -> Self {
cast([n, n, n, n, n, n, n, n])
}
pub fn blend(self, t: Self, f: Self) -> Self {
cfg_if::cfg_if! {
if #[cfg(all(feature = "simd", target_feature = "avx2"))] {
Self(unsafe { _mm256_blendv_epi8(f.0, t.0, self.0) })
} else {
Self(self.0.blend(t.0, f.0), self.1.blend(t.1, f.1))
}
}
}
pub fn cmp_eq(self, rhs: Self) -> Self {
cfg_if::cfg_if! {
if #[cfg(all(feature = "simd", target_feature = "avx2"))] {
Self(unsafe { _mm256_cmpeq_epi32(self.0, rhs.0) })
} else {
Self(self.0.cmp_eq(rhs.0), self.1.cmp_eq(rhs.1))
}
}
}
pub fn cmp_gt(self, rhs: Self) -> Self {
cfg_if::cfg_if! {
if #[cfg(all(feature = "simd", target_feature = "avx2"))] {
Self(unsafe { _mm256_cmpgt_epi32(self.0, rhs.0) })
} else {
Self(self.0.cmp_gt(rhs.0), self.1.cmp_gt(rhs.1))
}
}
}
pub fn cmp_lt(self, rhs: Self) -> Self {
cfg_if::cfg_if! {
if #[cfg(all(feature = "simd", target_feature = "avx2"))] {
// There is no `_mm256_cmplt_epi32`, but `a < b` is equivalent
// to `b > a`, so swap the operands and use `_mm256_cmpgt_epi32`
// directly. (Inverting the result of `a > b` would yield
// `a <= b`, not `a < b`.)
Self(unsafe { _mm256_cmpgt_epi32(rhs.0, self.0) })
} else {
Self(self.0.cmp_lt(rhs.0), self.1.cmp_lt(rhs.1))
}
}
}
pub fn to_f32x8(self) -> f32x8 {
cfg_if::cfg_if! {
if #[cfg(all(feature = "simd", target_feature = "avx2"))] {
cast(unsafe { _mm256_cvtepi32_ps(self.0) })
} else if #[cfg(all(feature = "simd", target_feature = "avx"))] {
cast([self.0.to_f32x4(), self.1.to_f32x4()])
} else {
f32x8(self.0.to_f32x4(), self.1.to_f32x4())
}
}
}
pub fn to_u32x8_bitcast(self) -> u32x8 {
bytemuck::cast(self)
}
pub fn to_f32x8_bitcast(self) -> f32x8 {
bytemuck::cast(self)
}
}
impl From<[i32; 8]> for i32x8 {
fn from(v: [i32; 8]) -> Self {
cast(v)
}
}
impl From<i32x8> for [i32; 8] {
fn from(v: i32x8) -> Self {
cast(v)
}
}
impl core::ops::Add for i32x8 {
type Output = Self;
fn add(self, rhs: Self) -> Self::Output {
cfg_if::cfg_if! {
if #[cfg(all(feature = "simd", target_feature = "avx2"))] {
Self(unsafe { _mm256_add_epi32(self.0, rhs.0) })
} else {
Self(self.0 + rhs.0, self.1 + rhs.1)
}
}
}
}
impl core::ops::BitAnd for i32x8 {
type Output = Self;
fn bitand(self, rhs: Self) -> Self::Output {
cfg_if::cfg_if! {
if #[cfg(all(feature = "simd", target_feature = "avx2"))] {
Self(unsafe { _mm256_and_si256(self.0, rhs.0) })
} else {
Self(self.0 & rhs.0, self.1 & rhs.1)
}
}
}
}
impl core::ops::Mul for i32x8 {
type Output = Self;
fn mul(self, rhs: Self) -> Self::Output {
cfg_if::cfg_if! {
if #[cfg(all(feature = "simd", target_feature = "avx2"))] {
Self(unsafe { _mm256_mullo_epi32(self.0, rhs.0) })
} else {
Self(self.0 * rhs.0, self.1 * rhs.1)
}
}
}
}
impl core::ops::BitOr for i32x8 {
type Output = Self;
#[inline]
fn bitor(self, rhs: Self) -> Self::Output {
cfg_if::cfg_if! {
if #[cfg(all(feature = "simd", target_feature = "avx2"))] {
Self(unsafe { _mm256_or_si256(self.0, rhs.0) })
} else {
Self(self.0 | rhs.0, self.1 | rhs.1)
}
}
}
}
impl core::ops::BitXor for i32x8 {
type Output = Self;
#[inline]
fn bitxor(self, rhs: Self) -> Self::Output {
cfg_if::cfg_if! {
if #[cfg(all(feature = "simd", target_feature = "avx2"))] {
Self(unsafe { _mm256_xor_si256(self.0, rhs.0) })
} else {
Self(self.0 ^ rhs.0, self.1 ^ rhs.1)
}
}
}
}

// Copyright 2020 Yevhenii Reizner
//
// Use of this source code is governed by a BSD-style license that can be
// found in the LICENSE file.
// This module was written from scratch, therefore there is no Google copyright.
// f32x16, i32x16 and u32x16 are implemented as [Tx8; 2] and not as [T; 16].
// This way we can still use some SIMD.
//
// We don't use #[inline] much in this module, since the compiler will
// inline most of the methods automatically. The only exception is u16x16,
// where we have to force inlining, otherwise performance is horrible.
#![allow(non_camel_case_types)]
mod f32x16_t;
mod f32x4_t;
mod f32x8_t;
mod i32x4_t;
mod i32x8_t;
mod u16x16_t;
mod u32x4_t;
mod u32x8_t;
pub use f32x16_t::f32x16;
pub use f32x4_t::f32x4;
pub use f32x8_t::f32x8;
pub use i32x4_t::i32x4;
pub use i32x8_t::i32x8;
pub use tiny_skia_path::f32x2;
pub use u16x16_t::u16x16;
pub use u32x4_t::u32x4;
pub use u32x8_t::u32x8;
#[allow(dead_code)]
#[inline]
pub fn generic_bit_blend<T>(mask: T, y: T, n: T) -> T
where
T: Copy + core::ops::BitXor<Output = T> + core::ops::BitAnd<Output = T>,
{
n ^ ((n ^ y) & mask)
}
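`generic_bit_blend` computes `n ^ ((n ^ y) & mask)`: where a mask bit is set, the result takes the bit from `y`; where it is clear, it keeps the bit from `n`. A quick scalar check of the identity:

```rust
fn generic_bit_blend(mask: u32, y: u32, n: u32) -> u32 {
    n ^ ((n ^ y) & mask)
}

fn main() {
    let y = 0xAAAA_AAAA;
    let n = 0x5555_5555;
    // With the mask fully set we get `y`; fully clear, `n`.
    assert_eq!(generic_bit_blend(u32::MAX, y, n), y);
    assert_eq!(generic_bit_blend(0, y, n), n);
    // Mixed mask: high half from `y`, low half from `n`.
    assert_eq!(generic_bit_blend(0xFFFF_0000, y, n), 0xAAAA_5555);
    println!("ok");
}
```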
/// A faster and more forgiving f32 min/max implementation.
///
/// Unlike the std one, it does not care about NaN.
#[allow(dead_code)]
pub trait FasterMinMax {
fn faster_min(self, rhs: f32) -> f32;
fn faster_max(self, rhs: f32) -> f32;
}
#[allow(dead_code)]
impl FasterMinMax for f32 {
fn faster_min(self, rhs: f32) -> f32 {
if rhs < self {
rhs
} else {
self
}
}
fn faster_max(self, rhs: f32) -> f32 {
if self < rhs {
rhs
} else {
self
}
}
}
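The difference from `f32::min`/`f32::max` shows up only with NaN: the std versions return the non-NaN operand, while the "forgiving" versions simply return whichever side the comparison falls through to. A standalone illustration (free functions mirroring the trait above):

```rust
fn faster_min(a: f32, b: f32) -> f32 {
    if b < a { b } else { a }
}

fn main() {
    assert_eq!(faster_min(1.0, 2.0), 1.0);
    // std's min prefers the non-NaN operand...
    assert_eq!(f32::min(f32::NAN, 1.0), 1.0);
    // ...while the forgiving version returns NaN here:
    // `1.0 < NaN` is false, so it falls through to the first argument.
    assert!(faster_min(f32::NAN, 1.0).is_nan());
    println!("ok");
}
```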

// Copyright 2020 Yevhenii Reizner
//
// Use of this source code is governed by a BSD-style license that can be
// found in the LICENSE file.
// No need to use explicit 256-bit AVX2 SIMD here:
// `-C target-cpu=native` will autovectorize this code better than we can.
// It is unclear why explicit instructions end up slower.
//
// On ARM AArch64 we can get up to a 2x performance boost by using SIMD.
//
// We also have to inline all the methods. They are pretty large,
// but without inlining, performance plummets.
#[cfg(all(feature = "simd", target_arch = "aarch64", target_feature = "neon"))]
use bytemuck::cast;
#[cfg(all(feature = "simd", target_arch = "aarch64", target_feature = "neon"))]
use core::arch::aarch64::uint16x8_t;
#[allow(non_camel_case_types)]
#[derive(Copy, Clone, PartialEq, Default, Debug)]
pub struct u16x16(pub [u16; 16]);
macro_rules! impl_u16x16_op {
($a:expr, $op:ident, $b:expr) => {
u16x16([
$a.0[0].$op($b.0[0]),
$a.0[1].$op($b.0[1]),
$a.0[2].$op($b.0[2]),
$a.0[3].$op($b.0[3]),
$a.0[4].$op($b.0[4]),
$a.0[5].$op($b.0[5]),
$a.0[6].$op($b.0[6]),
$a.0[7].$op($b.0[7]),
$a.0[8].$op($b.0[8]),
$a.0[9].$op($b.0[9]),
$a.0[10].$op($b.0[10]),
$a.0[11].$op($b.0[11]),
$a.0[12].$op($b.0[12]),
$a.0[13].$op($b.0[13]),
$a.0[14].$op($b.0[14]),
$a.0[15].$op($b.0[15]),
])
};
}
#[cfg(all(feature = "simd", target_arch = "aarch64", target_feature = "neon"))]
macro_rules! impl_aarch64_call {
($f:ident, $a:expr, $b:expr) => {
let a = $a.split();
let b = $b.split();
Self(bytemuck::cast([
unsafe { core::arch::aarch64::$f(a.0, b.0) },
unsafe { core::arch::aarch64::$f(a.1, b.1) },
]))
};
}
impl u16x16 {
#[inline]
pub fn splat(n: u16) -> Self {
Self([n, n, n, n, n, n, n, n, n, n, n, n, n, n, n, n])
}
#[inline]
pub fn as_slice(&self) -> &[u16; 16] {
&self.0
}
#[inline]
pub fn min(&self, rhs: &Self) -> Self {
cfg_if::cfg_if! {
if #[cfg(all(feature = "simd", target_arch = "aarch64", target_feature = "neon"))] {
impl_aarch64_call!(vminq_u16, self, rhs)
} else {
impl_u16x16_op!(self, min, rhs)
}
}
}
#[inline]
pub fn max(&self, rhs: &Self) -> Self {
cfg_if::cfg_if! {
if #[cfg(all(feature = "simd", target_arch = "aarch64", target_feature = "neon"))] {
impl_aarch64_call!(vmaxq_u16, self, rhs)
} else {
impl_u16x16_op!(self, max, rhs)
}
}
}
#[inline]
pub fn cmp_le(&self, rhs: &Self) -> Self {
cfg_if::cfg_if! {
if #[cfg(all(feature = "simd", target_arch = "aarch64", target_feature = "neon"))] {
impl_aarch64_call!(vcleq_u16, self, rhs)
} else {
Self([
if self.0[ 0] <= rhs.0[ 0] { !0 } else { 0 },
if self.0[ 1] <= rhs.0[ 1] { !0 } else { 0 },
if self.0[ 2] <= rhs.0[ 2] { !0 } else { 0 },
if self.0[ 3] <= rhs.0[ 3] { !0 } else { 0 },
if self.0[ 4] <= rhs.0[ 4] { !0 } else { 0 },
if self.0[ 5] <= rhs.0[ 5] { !0 } else { 0 },
if self.0[ 6] <= rhs.0[ 6] { !0 } else { 0 },
if self.0[ 7] <= rhs.0[ 7] { !0 } else { 0 },
if self.0[ 8] <= rhs.0[ 8] { !0 } else { 0 },
if self.0[ 9] <= rhs.0[ 9] { !0 } else { 0 },
if self.0[10] <= rhs.0[10] { !0 } else { 0 },
if self.0[11] <= rhs.0[11] { !0 } else { 0 },
if self.0[12] <= rhs.0[12] { !0 } else { 0 },
if self.0[13] <= rhs.0[13] { !0 } else { 0 },
if self.0[14] <= rhs.0[14] { !0 } else { 0 },
if self.0[15] <= rhs.0[15] { !0 } else { 0 },
])
}
}
}
#[inline]
pub fn blend(self, t: Self, e: Self) -> Self {
(t & self) | (e & !self)
}
#[inline]
#[cfg(all(feature = "simd", target_arch = "aarch64", target_feature = "neon"))]
pub fn split(self) -> (uint16x8_t, uint16x8_t) {
let pair: [uint16x8_t; 2] = cast(self.0);
(pair[0], pair[1])
}
}
impl core::ops::Add<u16x16> for u16x16 {
type Output = Self;
#[inline]
fn add(self, rhs: Self) -> Self::Output {
cfg_if::cfg_if! {
if #[cfg(all(feature = "simd", target_arch = "aarch64", target_feature = "neon"))] {
impl_aarch64_call!(vaddq_u16, self, rhs)
} else {
impl_u16x16_op!(self, add, rhs)
}
}
}
}
impl core::ops::Sub<u16x16> for u16x16 {
type Output = Self;
#[inline]
fn sub(self, rhs: Self) -> Self::Output {
cfg_if::cfg_if! {
if #[cfg(all(feature = "simd", target_arch = "aarch64", target_feature = "neon"))] {
impl_aarch64_call!(vsubq_u16, self, rhs)
} else {
impl_u16x16_op!(self, sub, rhs)
}
}
}
}
impl core::ops::Mul<u16x16> for u16x16 {
type Output = Self;
#[inline]
fn mul(self, rhs: Self) -> Self::Output {
cfg_if::cfg_if! {
if #[cfg(all(feature = "simd", target_arch = "aarch64", target_feature = "neon"))] {
impl_aarch64_call!(vmulq_u16, self, rhs)
} else {
impl_u16x16_op!(self, mul, rhs)
}
}
}
}
impl core::ops::Div<u16x16> for u16x16 {
type Output = Self;
#[inline]
fn div(self, rhs: Self) -> Self::Output {
impl_u16x16_op!(self, div, rhs)
}
}
impl core::ops::BitAnd<u16x16> for u16x16 {
type Output = Self;
#[inline]
fn bitand(self, rhs: Self) -> Self::Output {
cfg_if::cfg_if! {
if #[cfg(all(feature = "simd", target_arch = "aarch64", target_feature = "neon"))] {
impl_aarch64_call!(vandq_u16, self, rhs)
} else {
impl_u16x16_op!(self, bitand, rhs)
}
}
}
}
impl core::ops::BitOr<u16x16> for u16x16 {
type Output = Self;
#[inline]
fn bitor(self, rhs: Self) -> Self::Output {
cfg_if::cfg_if! {
if #[cfg(all(feature = "simd", target_arch = "aarch64", target_feature = "neon"))] {
impl_aarch64_call!(vorrq_u16, self, rhs)
} else {
impl_u16x16_op!(self, bitor, rhs)
}
}
}
}
impl core::ops::Not for u16x16 {
type Output = Self;
#[inline]
fn not(self) -> Self::Output {
u16x16([
!self.0[0],
!self.0[1],
!self.0[2],
!self.0[3],
!self.0[4],
!self.0[5],
!self.0[6],
!self.0[7],
!self.0[8],
!self.0[9],
!self.0[10],
!self.0[11],
!self.0[12],
!self.0[13],
!self.0[14],
!self.0[15],
])
}
}
impl core::ops::Shr for u16x16 {
type Output = Self;
#[inline]
fn shr(self, rhs: Self) -> Self::Output {
impl_u16x16_op!(self, shr, rhs)
}
}
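`u16x16::blend` uses the direct form `(t & mask) | (e & !mask)` rather than the XOR identity; with all-ones/all-zeros lane masks (as produced by `cmp_le`) both forms select whole lanes and are interchangeable. A scalar sketch:

```rust
fn blend(mask: u16, t: u16, e: u16) -> u16 {
    (t & mask) | (e & !mask)
}

fn main() {
    let t = 0x1234u16;
    let e = 0xABCDu16;
    assert_eq!(blend(!0, t, e), t); // mask all ones -> take `t`
    assert_eq!(blend(0, t, e), e);  // mask all zeros -> take `e`
    // Bit-for-bit equivalent to the XOR form used by `generic_bit_blend`:
    assert_eq!(blend(0xFF00, t, e), e ^ ((e ^ t) & 0xFF00));
    println!("ok");
}
```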

// Copyright 2020 Yevhenii Reizner
//
// Use of this source code is governed by a BSD-style license that can be
// found in the LICENSE file.
// Based on https://github.com/Lokathor/wide (Zlib)
cfg_if::cfg_if! {
if #[cfg(all(feature = "simd", target_feature = "sse2"))] {
#[cfg(target_arch = "x86")]
use core::arch::x86::*;
#[cfg(target_arch = "x86_64")]
use core::arch::x86_64::*;
// unused when AVX2 is available
#[cfg(not(target_feature = "avx2"))]
use bytemuck::cast;
#[derive(Clone, Copy, Debug)]
#[repr(C, align(16))]
pub struct u32x4(__m128i);
} else if #[cfg(all(feature = "simd", target_feature = "simd128"))] {
use core::arch::wasm32::*;
#[derive(Clone, Copy, Debug)]
#[repr(C, align(16))]
pub struct u32x4(v128);
} else if #[cfg(all(feature = "simd", target_arch = "aarch64", target_feature = "neon"))] {
use core::arch::aarch64::*;
#[derive(Clone, Copy, Debug)]
#[repr(C, align(16))]
pub struct u32x4(uint32x4_t);
} else {
#[derive(Clone, Copy, Debug)]
#[repr(C, align(16))]
pub struct u32x4([u32; 4]);
}
}
unsafe impl bytemuck::Zeroable for u32x4 {}
unsafe impl bytemuck::Pod for u32x4 {}
impl Default for u32x4 {
fn default() -> Self {
Self::splat(0)
}
}
impl u32x4 {
pub fn splat(n: u32) -> Self {
bytemuck::cast([n, n, n, n])
}
// unused when AVX2 is available
#[cfg(not(target_feature = "avx2"))]
pub fn cmp_eq(self, rhs: Self) -> Self {
cfg_if::cfg_if! {
if #[cfg(all(feature = "simd", target_feature = "sse2"))] {
Self(unsafe { _mm_cmpeq_epi32(self.0, rhs.0) })
} else if #[cfg(all(feature = "simd", target_feature = "simd128"))] {
Self(u32x4_eq(self.0, rhs.0))
} else if #[cfg(all(feature = "simd", target_arch = "aarch64", target_feature = "neon"))] {
Self(unsafe { vceqq_u32(self.0, rhs.0) })
} else {
Self([
if self.0[0] == rhs.0[0] { u32::MAX } else { 0 },
if self.0[1] == rhs.0[1] { u32::MAX } else { 0 },
if self.0[2] == rhs.0[2] { u32::MAX } else { 0 },
if self.0[3] == rhs.0[3] { u32::MAX } else { 0 },
])
}
}
}
// unused when AVX2 is available
#[cfg(not(target_feature = "avx2"))]
pub fn shl<const RHS: i32>(self) -> Self {
cfg_if::cfg_if! {
if #[cfg(all(feature = "simd", target_feature = "sse2"))] {
let shift = cast([RHS as u64, 0]);
Self(unsafe { _mm_sll_epi32(self.0, shift) })
} else if #[cfg(all(feature = "simd", target_feature = "simd128"))] {
Self(u32x4_shl(self.0, RHS as _))
} else if #[cfg(all(feature = "simd", target_arch = "aarch64", target_feature = "neon"))] {
Self(unsafe { vshlq_n_u32::<RHS>(self.0) })
} else {
let u = RHS as u64;
Self([
self.0[0] << u,
self.0[1] << u,
self.0[2] << u,
self.0[3] << u,
])
}
}
}
// unused when AVX2 is available
#[cfg(not(target_feature = "avx2"))]
pub fn shr<const RHS: i32>(self) -> Self {
cfg_if::cfg_if! {
if #[cfg(all(feature = "simd", target_feature = "sse2"))] {
let shift: __m128i = cast([RHS as u64, 0]);
Self(unsafe { _mm_srl_epi32(self.0, shift) })
} else if #[cfg(all(feature = "simd", target_feature = "simd128"))] {
Self(u32x4_shr(self.0, RHS as _))
} else if #[cfg(all(feature = "simd", target_arch = "aarch64", target_feature = "neon"))] {
Self(unsafe { vshrq_n_u32::<RHS>(self.0) })
} else {
let u = RHS as u64;
Self([
self.0[0] >> u,
self.0[1] >> u,
self.0[2] >> u,
self.0[3] >> u,
])
}
}
}
}
impl core::ops::Not for u32x4 {
type Output = Self;
fn not(self) -> Self {
cfg_if::cfg_if! {
if #[cfg(all(feature = "simd", target_feature = "sse2"))] {
let all_bits = unsafe { _mm_set1_epi32(-1) };
Self(unsafe { _mm_xor_si128(self.0, all_bits) })
} else if #[cfg(all(feature = "simd", target_feature = "simd128"))] {
Self(v128_not(self.0))
} else if #[cfg(all(feature = "simd", target_arch = "aarch64", target_feature = "neon"))] {
Self(unsafe { vmvnq_u32(self.0) })
} else {
Self([
!self.0[0],
!self.0[1],
!self.0[2],
!self.0[3],
])
}
}
}
}
impl core::ops::Add for u32x4 {
type Output = Self;
fn add(self, rhs: Self) -> Self::Output {
cfg_if::cfg_if! {
if #[cfg(all(feature = "simd", target_feature = "sse2"))] {
Self(unsafe { _mm_add_epi32(self.0, rhs.0) })
} else if #[cfg(all(feature = "simd", target_feature = "simd128"))] {
Self(u32x4_add(self.0, rhs.0))
} else if #[cfg(all(feature = "simd", target_arch = "aarch64", target_feature = "neon"))] {
Self(unsafe { vaddq_u32(self.0, rhs.0) })
} else {
Self([
self.0[0].wrapping_add(rhs.0[0]),
self.0[1].wrapping_add(rhs.0[1]),
self.0[2].wrapping_add(rhs.0[2]),
self.0[3].wrapping_add(rhs.0[3]),
])
}
}
}
}
impl core::ops::BitAnd for u32x4 {
type Output = Self;
fn bitand(self, rhs: Self) -> Self::Output {
cfg_if::cfg_if! {
if #[cfg(all(feature = "simd", target_feature = "sse2"))] {
Self(unsafe { _mm_and_si128(self.0, rhs.0) })
} else if #[cfg(all(feature = "simd", target_feature = "simd128"))] {
Self(v128_and(self.0, rhs.0))
} else if #[cfg(all(feature = "simd", target_arch = "aarch64", target_feature = "neon"))] {
Self(unsafe { vandq_u32(self.0, rhs.0) })
} else {
Self([
self.0[0] & rhs.0[0],
self.0[1] & rhs.0[1],
self.0[2] & rhs.0[2],
self.0[3] & rhs.0[3],
])
}
}
}
}
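The scalar fallback for `shl`/`shr` applies a compile-time shift amount to every lane, passed as a const generic just like `shl::<RHS>` above. A minimal standalone sketch of that pattern (the free function here is illustrative, not the crate's API):

```rust
// Shift each lane left by a compile-time constant, as the fallback path does.
fn shl<const RHS: u32>(v: [u32; 4]) -> [u32; 4] {
    let mut out = [0u32; 4];
    for i in 0..4 {
        out[i] = v[i] << RHS;
    }
    out
}

fn main() {
    // Bits shifted out of the top are discarded, so 1 << 24 shifted by 8 is 0.
    assert_eq!(shl::<8>([1, 2, 3, 0x0100_0000]), [0x100, 0x200, 0x300, 0]);
    println!("ok");
}
```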

// Copyright 2020 Yevhenii Reizner
//
// Use of this source code is governed by a BSD-style license that can be
// found in the LICENSE file.
// Based on https://github.com/Lokathor/wide (Zlib)
use super::{f32x8, i32x8};
cfg_if::cfg_if! {
if #[cfg(all(feature = "simd", target_feature = "avx2"))] {
#[cfg(target_arch = "x86")]
use core::arch::x86::*;
#[cfg(target_arch = "x86_64")]
use core::arch::x86_64::*;
use bytemuck::cast;
#[derive(Clone, Copy, Debug)]
#[repr(C, align(32))]
pub struct u32x8(__m256i);
} else {
use super::u32x4;
#[derive(Clone, Copy, Debug)]
#[repr(C, align(32))]
pub struct u32x8(u32x4, u32x4);
}
}
unsafe impl bytemuck::Zeroable for u32x8 {}
unsafe impl bytemuck::Pod for u32x8 {}
impl Default for u32x8 {
fn default() -> Self {
Self::splat(0)
}
}
impl u32x8 {
pub fn splat(n: u32) -> Self {
bytemuck::cast([n, n, n, n, n, n, n, n])
}
pub fn to_i32x8_bitcast(self) -> i32x8 {
bytemuck::cast(self)
}
pub fn to_f32x8_bitcast(self) -> f32x8 {
bytemuck::cast(self)
}
pub fn cmp_eq(self, rhs: Self) -> Self {
cfg_if::cfg_if! {
if #[cfg(all(feature = "simd", target_feature = "avx2"))] {
Self(unsafe { _mm256_cmpeq_epi32(self.0, rhs.0) })
} else {
Self(self.0.cmp_eq(rhs.0), self.1.cmp_eq(rhs.1))
}
}
}
pub fn shl<const RHS: i32>(self) -> Self {
cfg_if::cfg_if! {
if #[cfg(all(feature = "simd", target_feature = "avx2"))] {
let shift: __m128i = cast([RHS as u64, 0]);
Self(unsafe { _mm256_sll_epi32(self.0, shift) })
} else {
Self(self.0.shl::<RHS>(), self.1.shl::<RHS>())
}
}
}
pub fn shr<const RHS: i32>(self) -> Self {
cfg_if::cfg_if! {
if #[cfg(all(feature = "simd", target_feature = "avx2"))] {
let shift: __m128i = cast([RHS as u64, 0]);
Self(unsafe { _mm256_srl_epi32(self.0, shift) })
} else {
Self(self.0.shr::<RHS>(), self.1.shr::<RHS>())
}
}
}
}
impl core::ops::Not for u32x8 {
type Output = Self;
fn not(self) -> Self {
cfg_if::cfg_if! {
if #[cfg(all(feature = "simd", target_feature = "avx2"))] {
let all_bits = unsafe { _mm256_set1_epi16(-1) };
Self(unsafe { _mm256_xor_si256(self.0, all_bits) })
} else {
Self(!self.0, !self.1)
}
}
}
}
impl core::ops::Add for u32x8 {
type Output = Self;
fn add(self, rhs: Self) -> Self::Output {
cfg_if::cfg_if! {
if #[cfg(all(feature = "simd", target_feature = "avx2"))] {
Self(unsafe { _mm256_add_epi32(self.0, rhs.0) })
} else {
Self(self.0 + rhs.0, self.1 + rhs.1)
}
}
}
}
impl core::ops::BitAnd for u32x8 {
type Output = Self;
fn bitand(self, rhs: Self) -> Self::Output {
cfg_if::cfg_if! {
if #[cfg(all(feature = "simd", target_feature = "avx2"))] {
Self(unsafe { _mm256_and_si256(self.0, rhs.0) })
} else {
Self(self.0 & rhs.0, self.1 & rhs.1)
}
}
}
}

Tests that start with `skia_` are ports of the Skia tests.
