PHP安全三板斧:转义篇Blade引擎避免XSS攻击原理

2016 年 6 月 15 日3570

　　PHP 转义实现

　　把输出渲染成网页或API响应时，一定要转义输出，这也是一种防护措施，能避免渲染恶意代码，造成XSS攻击，还能防止应用的用户无意中执行恶意代码。

　　我们可以使用前面提到的htmlentities函数转移输出，该函数的第二个参数一定要使用ENT_QUOTES，让这个函数转义单引号和双引号，而且，还要在第三个参数中指定合适的字符编码(通常是UTF-8)，下面的例子演示了如何在渲染前转义HTML输出：

　　';echo htmlentities($output, ENT_QUOTES, ‘UTF-8');

　　如果不转义直接输出，会弹出提示框：

　　转义之后输出变成：

　　现代PHP支持许多模板引擎，这些模板引擎在底层已经为了做好了转义处理，比如现在流行的twig/twig和smarty/smarty都会自动转义输出。这种默认处理方式很赞，为PHP Web应用提供了有力的安全保障。

　　Blade 模板引擎避免XSS攻击原理

　　Laravel使用的模板引擎是Blade，关于Blade的使用可以参考其官方文档，这里我们简单探讨下Laravel底层如何对输出进行转义处理。

　　一般我们在Laravel中返回视图内容会这么做：

　　return view(’test’, [‘data’=>$data]);

　　这是一个很简单的例子，意味着我们会在resources/views目录下找到test.blade.php视图文件，然后将$data变量传入其中，并将最终渲染结果作为响应的内容返回给用户。那么这一过程经历了哪些底层源码的处理，如果$data变量中包含脚本代码(如JavaScript脚本)，又该怎么去处理呢？接下来我们让来一窥究竟。

　　首先我们从辅助函数view入手，当然这里我们也可以使用View:make，但是简单起见，我们一般用view函数，该函数定义在Illuminate\Foundation\helpers.php文件中：

　　function view($view = null, $data = [], $mergeData = []){ $factory = app(ViewFactory::class); if (func_num_args() === 0) { return $factory; } return $factory->make($view, $data, $mergeData);}

　　该函数中的逻辑是从容器中取出视图工厂接口ViewFactory对应的实例$factory(该绑定关系在Illuminate\View\ViewServiceProvider的register方法中注册，此外这里还注册了模板引擎解析器EngineResolver，包括PhpEngine和载入BladeCompiler的CompilerEngine，以及视图文件查找器FileViewFinder，一句话，这里注册了视图解析所需的所有服务)，如果传入了参数，则调用$factory上的make方法：

　　public function make($view, $data = [], $mergeData = []){ if (isset($this->aliases[$view])) { $view = $this->aliases[$view]; } $view = $this->normalizeName($view); $path = $this->finder->find($view); $data = array_merge($mergeData, $this->parseData($data)); $this->callCreator($view = new View($this, $this->getEngineFromPath($path), $view, $path, $data)); return $view;}

　　这个方法位于Illuminate\View\Factory，这里所做的事情是获取视图文件的完整路径，合并传入变量，$this->getEngineFromPath会通过视图文件后缀获取相应的模板引擎，比如我们使用.blade.php结尾的视图文件则获得到的是CompilerEngine(即Blade模板引擎)，否则将获取到PhpEngine，然后我们根据相应参数实例化View(Illuminate\View\View)对象并返回。需要注意的是View类中重写了__toString方法：

　　public function __toString(){ return $this->render();}

　　所以当我们打印$view实例的时候，实际上会调用View类的render方法，所以下一步我们理所应当研究render方法做了些什么：

　　public function render(callable $callback = null){ try { $contents = $this->renderContents(); $response = isset($callback) ? call_user_func($callback, $this, $contents) : null; // Once we have the contents of the view, we will flush the sections if we are // done rendering all views so that there is nothing left hanging over when // another view gets rendered in the future by the application developer. $this->factory->flushSectionsIfDoneRendering(); return ! is_null($response) ? $response : $contents; } catch (Exception $e) { $this->factory->flushSections(); throw $e; } catch (Throwable $e) { $this->factory->flushSections(); throw $e; }}

　　这里重点是$this->renderContents()方法，我们继续深入研究View类中的renderContents方法：

　　protected function renderContents(){ // We will keep track of the amount of views being rendered so we can flush // the section after the complete rendering operation is done. This will // clear out the sections for any separate views that may be rendered. $this->factory->incrementRender(); $this->factory->callComposer($this); $contents = $this->getContents(); // Once we've finished rendering the view, we'll decrement the render count // so that each sections get flushed out next time a view is created and // no old sections are staying around in the memory of an environment. $this->factory->decrementRender(); return $contents;}

　　我们重点关注$this->getContents()这里，进入getContents方法：

　　protected function getContents(){ return $this->engine->get($this->path, $this->gatherData());}

　　我们在前面已经提到，这里的$this->engine对应CompilerEngine(Illuminate\View\Engines\CompilerEngine)，所以我们进入CompilerEngine的get方法：

　　public function get($path, array $data = []){ $this->lastCompiled[] = $path; // If this given view has expired, which means it has simply been edited since // it was last compiled, we will re-compile the views so we can evaluate a // fresh copy of the view. We'll pass the compiler the path of the view. if ($this->compiler->isExpired($path)) { $this->compiler->compile($path); } $compiled = $this->compiler->getCompiledPath($path); // Once we have the path to the compiled file, we will evaluate the paths with // typical PHP just like any other templates. We also keep a stack of views // which have been rendered for right exception messages to be generated. $results = $this->evaluatePath($compiled, $data); array_pop($this->lastCompiled); return $results;}

　　同样我们在之前提到，CompilerEngine使用的compiler是BladeCompiler，所以$this->compiler也就是Blade编译器，我们先看$this->compiler->compile($path);这一行(首次运行或者编译好的视图模板已过期会进这里)，进入BladeCompiler的compile方法：

　　public function compile($path = null){ if ($path) { $this->setPath($path); } if (! is_null($this->cachePath)) { $contents = $this->compileString($this->files->get($this->getPath())); $this->files->put($this->getCompiledPath($this->getPath()), $contents); }}

　　这里我们做的事情是先编译视图文件内容，然后将编译好的内容存放到视图编译路径(storage\framework\views)下对应的文件(一次编译，多次运行，以提高性能)，这里我们重点关注的是$this->compileString方法，该方法中使用了token_get_all函数将视图文件代码分割成多个片段，如果片段是数组的话则循环调用$this->parseToken方法：

　　protected function parseToken($token){ list($id, $content) = $token; if ($id == T_INLINE_HTML) { foreach ($this->compilers as $type) { $content = $this->{"compile{$type}"}($content); } } return $content;}

　　来到这里，我们已经很接近真相了，针对HTML代码(含Blade指令代码)，循环调用compileExtensions、compileStatements、compileComments和compileEchos方法，我们重点关注输出方法compileEchos，Blade引擎默认提供了compileRawEchos、compileEscapedEchos和compileRegularEchos三种输出方法，对应的指令分别是{!! !!}、{{{ }}}和{{ }}，顾名思义，compileRawEchos对应的是原生输出：

　　protected function compileRawEchos($value){ $pattern = sprintf('/(@)?%s\s*(.+?)\s*%s(\r?\n)?/s', $this->rawTags[0], $this->rawTags[1]); $callback = function ($matches) { $whitespace = empty($matches[3]) ? '' : $matches[3].$matches[3]; return $matches[1] ? substr($matches[0], 1) : 'compileEchoDefaults($matches[2]).'; ?>'.$whitespace; }; return preg_replace_callback($pattern, $callback, $value);}

　　即Blade视图中以{!! !!}包裹的变量会原生输出HTML，如果要显示图片、链接，推荐这种方式。

　　{{{}}}对应的CompileEscapedEchos，这个在Laravel 4.2及以前版本中用于转义，现在已经替换成了{{}}，即调用compileRegularEchos方法：

　　protected function compileRegularEchos($value){ $pattern = sprintf('/(@)?%s\s*(.+?)\s*%s(\r?\n)?/s', $this->contentTags[0], $this->contentTags[1]); $callback = function ($matches) { $whitespace = empty($matches[3]) ? '' : $matches[3].$matches[3]; $wrapped = sprintf($this->echoFormat, $this->compileEchoDefaults($matches[2])); return $matches[1] ? substr($matches[0], 1) : ''.$whitespace; }; return preg_replace_callback($pattern, $callback, $value);}

　　其中$this->echoFormat对应e(%s)，无独有偶，compileEscapedEchos中也用到这个方法：

　　protected function compileEscapedEchos($value){ $pattern = sprintf('/(@)?%s\s*(.+?)\s*%s(\r?\n)?/s', $this->escapedTags[0], $this->escapedTags[1]); $callback = function ($matches) { $whitespace = empty($matches[3]) ? '' : $matches[3].$matches[3]; return $matches[1] ? $matches[0] : 'compileEchoDefaults($matches[2]).'); ?>'.$whitespace; }; return preg_replace_callback($pattern, $callback, $value);}

　　辅助函数e()定义在Illuminate\Support\helpers.php中：

　　function e($value){ if ($value instanceof Htmlable) { return $value->toHtml(); } return htmlentities($value, ENT_QUOTES, 'UTF-8', false);}

　　其作用就是对输入的值进行转义。

　　经过这样的转义，视图中的{{ $data }}或被编译成，最终如何将$data传入视图输出，我们再回到CompilerEngine的get方法，看这一段：

　　$results = $this->evaluatePath($compiled, $data);

　　evaluatePath中传入了编译后的视图文件路径和传入的变量$data，该方法定义如下：

　　protected function evaluatePath($__path, $__data){ $obLevel = ob_get_level();ob_start(); extract($__data, EXTR_SKIP); // We'll evaluate the contents of the view inside a try/catch block so we can // flush out any stray output that might get out before an error occurs or // an exception is thrown. This prevents any partial views from leaking. try { include $__path; } catch (Exception $e) { $this->handleViewException($e, $obLevel); } catch (Throwable $e) { $this->handleViewException(new FatalThrowableError($e), $obLevel); } return ltrim(ob_get_clean());}

　　这里面调用了PHP系统函数extract将传入变量从数组中导入当前符号表(通过include $__path引入)，其作用也就是将编译后视图文件中的变量悉数替换成传入的变量值(通过键名映射)。

　　好了，这就是Blade视图模板从渲染到输出的基本过程，可以看到我们通过{{}}来转义输出，从而达到避免XSS攻击的目的。

0 0

« wdcp PHP安全三板斧：验证篇Laravel底层字段验证实现 »